Symmetry Breaking in SAT Solving

. SAT solvers are now widely used to solve a large variety of problems, including formal veriﬁcation of systems. SAT problems derived from such applications often exhibit symmetry properties that could be exploited to speed up their solving. Static symmetry breaking is so far the most popular approach to take advantage of symmetries. It relies on a symmetry preprocessor which augments the initial problem with constraints that force the solver to consider only a few conﬁgurations among the many symmetric ones. This paper presents a new way to handle symmetries, that avoid the main problem of the current static approaches: the prohibitive cost of the preprocessing phase. Our proposal has been implemented in MiniSym . Extensive experiments on the benchmarks of last six SAT competitions show that our approach is competitive with the best state-of-the-art static symmetry breaking solutions.


Introduction
Nowadays, Boolean satisfiability (SAT) is an active research area finding its applications in many contexts such as planning decision [14], hardware and software verification [3], cryptology [19], computational biology [17], etc.Hence, the development of approaches that could treat increasingly challenging SAT problems has become a focus.
State-of-the-art complete solvers of SAT problems are based on the wellknown Conflict Driven Clauses Learning (CDCL) algorithm [18], itself inspired from the Davis-Putnam-Logemann-Loveland algorithm [6].These are complete backtracking based search algorithms that welcome any heuristic/optimisation pruning of parts of the explored search tree.In this paper, we are interested in exploiting the symmetry properties of SAT problems to perform such a pruning.
Symmetries in SAT Solving.SAT problems often exhibit symmetries 1 , and not taking them into account forces solvers to needlessly explore isomorphic parts of the search space.
For example, the "pigeonhole problem" (where n pigeons are put into n − 1 holes, with the constraint that each pigeon must be in a different hole) is a highly symmetric problem.Indeed, all the pigeons (resp.holes) are swappable without changing the initial problem.Trying to solve it with a standard SAT solver, like MiniSAT [10], turns out to be very time consuming (and even impossible, in reasonable time, for high values of n).Here, such a standard solver ignores the symmetry property of the problem, and then potentially tries all variables combinations; this eventually leads to a combinatorial explosion.
Symmetries of a SAT problem are classically obtained through a reduction to an equivalent graph automorphism problem.Technically, the SAT problem is converted to a colored graph, then it is passed to a tool, like saucy3 [13] or bliss [12], to compute its automorphism group.
A common approach to exploit such symmetries is to pre-compute and enrich the original SAT problem with symmetry breaking predicates (sbp).These added predicates will prevent the solver from visiting equivalent (isomorphic) parts that eventually yield the same results [1,5].This technique, called static symmetry breaking, has been implemented first in the state-of-the-art tool SHATTER [2] and then improved in BREAKID [8].However, while giving excellent results on numerous symmetric problems, these approaches still fail to solve some classes of symmetric problems.
Another class of approaches exists, known as dynamic symmetry breaking techniques.They intervene directly during the search exploration.It concerns, to mention but a few, the injection of symmetric versions of learned clauses [7,21], particular classes of symmetries [20], or speeding up the search by inferring symmetric facts [9].These approaches succeeded in treating particular and hand crafted problems but, to the best of our knowledge, none of them is competitive face to the static symmetry breaking methods.

Drawbacks of the Static-Based Approaches.
In the general case, the size of the sbp can be exponential in the number of variables of the problem so that they cannot be totally computed.Even in more favorable situations, the size of the generated sbp is often too large to be effectively handled by a SAT solver [15].On the other hand, if only a subset of the symmetries is considered then the resulting search pruning will not be that interesting and its effectiveness depends heavily on the heuristically chosen symmetries [4].Besides, these approaches are preprocessors, so their combination with other techniques, such as symmetry propagation [9], can be very hard.Also, tuning their parameters during the solving turns out to be very difficult.For all these reasons, some classes of SAT problems cannot be solved yet despite exhibiting symmetries.
Proposed Solution.To handle these issues, we propose a new approach that reuses the principles of the static approaches, but operates dynamically: the symmetries are broken during the search process without any pre-generation of the sbp.To do so, we elaborate the notions of symmetry status tracking and effective symmetric breaking predicates (esbp).
The approach is implemented using a couple of components: (1) a Conflict Driven Clauses Learning (CDCL) search engine; (2) a symmetry controller.Roughly speaking, the first component performs the classical search activity on the SAT problem, while the second observes the engine and maintains the status of the symmetries.When the controller detects a situation where the engine is starting to explore a redundant part2 , it orders the engine to operate a backjump.The detection is performed thanks to symmetry status tracking and the backjump order is given by a simple injection of an esbp computed on the fly.
The main advantage of such an approach is to cope with the heavy (and potentially blocking) pre-generation phase of the static-based approaches, but also offers opportunities to combine with other dynamic-based approaches, like the symmetry propagation technique [9].It also gives more flexibility for adjusting some parameters on the fly.Moreover, the overhead for non symmetric formulas is reduced to the computation time of the graph automorphism.
The extensive evaluation of our approach on the symmetric formulas of the last six SAT contests shows that it outperforms the state-of-the-art techniques, in particular on unsatisfiable instances, which are the hardest class of the problem.
Content of the Paper.The remainder of the paper is organized as follows.Section 2 is dedicated to preliminaries and definitions.Section 3 discusses the details of our CDCLSym algorithm.Section 4 highlights our tooling support and evaluations.Section 5 concludes this work and gives directions for future work.

Preliminaries and Definitions
This section introduces some definitions.First, we define the problem of Boolean satisfiability.Then, we introduce the notions of ordering and monotonicity that provide a lexicographical order to assignments.These are central concepts to the definition of a representative assignment.
Finally, we introduce two core notions that are required to define our new algorithm: (i) Reducer, inactive and active permutation, and (ii) the effective symmetry breaking predicates (esbp).

Basics on Boolean Satisfiability
A Boolean variable, or propositional variable, is a variable that has two possible values: true or false (noted or ⊥, respectively).A literal l is a propositional variable or its negation.For a given variable x, the positive literal is represented by x and the negative one by ¬x.A clause ω is a finite disjunction of literals represented equivalently by ω = k i=1 l i or the set of its literals ω = {l i } i∈ 1,k .A clause with a single literal is called unit clause.A conjunctive normal form (CNF) formula ϕ is a finite conjunction of clauses.A CNF can be either noted ϕ = k i=1 ω i or ϕ = {ω i } i∈ 1,k .We denote V ϕ (L ϕ ) the set of variables (literals) used in ϕ (the index in V ϕ and L ϕ is usually omitted when clear from context).
For a given formula ϕ, an assignment of the variables of ϕ is a function α : V → { , ⊥}.As usual, α is total, or complete, when all elements of V have an image by α, otherwise it is partial.By abuse of notation, an assignment is often represented by the set of its true literals.The set of all (possibly partial) assignments of V is noted Ass(V).
The assignment α satisfies the clause ω, denoted α |= ω, if α ∩ ω = ∅.Similarly, the assignment α satisfies the propositional formula ϕ, denoted α |= ϕ, if α satisfies all the clauses of ϕ.Note that a formula may be satisfied by a partial assignment.A formula is said to be satisfiable (sat) if there is at least one assignment that satisfies it; otherwise the formula is unsatisfiable (unsat).
and is reported to be sat.Note that the assignment α, making ϕ sat, does not need to be complete because x 3 is a don't care variable with respect to α.

Ordering and Monotonicity
In order to exploit the symmetry properties of a SAT problem, we need to introduce an ordering relation between the assignments.Definition 1 (Assignments ordering).We assume a total order, ≺, on V. Given two assignments (α, β) ∈ Ass(V) 2 , we say that α is strictly smaller than β, noted α < β, if there exists a variable v ∈ V such that: -¬v ∈ α and v ∈ β. 3   Note that < coincides with the lexicographical order on complete assignments.Furthermore, the < relation is monotonic as expressed in the following proposition.
It is worth noting that this last proposition is the key property for the efficient implementation of our algorithm.

Symmetry Group of a Formula
The group of permutations of V (i.e.bijections from V to V) is noted S(V).The group S(V) naturally acts on the set of literals: for g ∈ S(V) and a literal ∈ L, g. = g( ) if is a positive literal, g. = ¬g(¬ ) if is a negative literal.The group S(V) also acts on (partial) assignments of V as follows: for g ∈ S(V), α ∈ Ass(V), g.α = {g.| ∈ α}.Let ϕ be a formula, and g ∈ S(V).We say that Let G be a subgroup of S(V).The orbit of α under G (or simply the orbit of The lexicographic leader (lex-leader for short) of an orbit [α] G is defined by min < ([α] G ).This lex-leader is unique because the lexicographic order is a total order.
The optimal approach to solve a symmetric SAT problem would be to explore only one assignment per orbit (for instance each lex-leader ).However, finding the lex-leader of an orbit is computationally hard [16].
What we propose here is a best effort approach that tries to eliminate, dynamically, the non lex-leading assignments with a minimal computation effort.To do so, we first introduce the notions of reducer, inactive and active permutation with respect to an assignment α.

Definition 2 (Reducer, inactive and active permutation).
A permutation g is a reducer of an assignment α if g.α < α (hence α cannot be the lex-leader of its orbit.g reduces it and all its extensions).g is inactive on α when α < g.α (so, g cannot reduce α and all the extensions).A symmetry is said to be active with respect to α when it is neither inactive nor a reducer of α.
Proposition 2 restates this definition in terms of variables and is the basis of an efficient algorithm to keep track of the status of a permutation during the solving.Let us, first, recall that the support, V g , of a permutation g is the set Proposition 2. Let α ∈ Ass(V) be an assignment, g ∈ S(V) a permutation and V g ⊆ V the support of g.We say that g is: 1. a reducer of α if there exists a variable v ∈ V g such that: - When g is a reducer of α we can define a predicate that contradicts α yet preserves the satisfiability of the formula.Such a predicate will be used to discard α, and all its extensions, from a further visit and hence pruning the search tree.

Definition 3 (Effective Symmetry Breaking Predicate).
Let α ∈ Ass(V), and g ∈ S(V).We say that the formula ψ is an effective symmetry breaking predicate (esbp for short) for α under g if: The next definition gives a way to obtain such an effective symmetry-breaking predicate from an assignment and a reducer.

Definition 4 (A construction of an esbp).
Let ϕ be a formula.Let g be a symmetry of ϕ that reduces an assignment α.Let v be the variable whose existence is given by item 1. in Proposition 2. Let U = {v , ¬v | v ∈ V g and v v}.We define η(α, g) as ), and a partial assignment We denote a α and β as the restrictions of α and β to the variables in {v ∈ V g | v v}.Since β ∧ η(α, g) is unsat, α = β .But g.α < α , and g.β < β .By monotonicity of <, we thus also have g.β < β.
It is important to observe that the notion of ebsp is a refinement of the classical concept of sbp defined in [2].In particular, like sbp, esbp preserve satisfiability.
Proof.If ϕ ∧ ψ is SAT then ϕ is trivially SAT.If ϕ is SAT, then there is some assignment β that satisfies ϕ.Without loss of generality, β can be chosen to be the lex-leader of its orbit under S(ϕ).Thus, g does not reduce β, which implies that β |= ψ.

CDCLSym Algorithm
This section describes how to augment the state-of-the-art CDCL algorithm with the aforementioned concepts to develop an efficient symmetry-guided SAT solving algorithm.We first recall how the CDCL algorithm works.We then explain how to extend it with a symmetry controller component which guides the behavior of CDCL algorithm depending on the status of symmetries.

Classical CDCL
A Conflict-Driven Clause Learning (CDCL) algorithm is depicted in Algorithm 1.The parts in red (grey in B&W printings) should be ignored for the moment.
The algorithm walks a binary search tree.It first applies unit propagation to the formula ϕ for the current assignment α (line 4).A conflict at level 0 indicates that the formula is not satisfiable, and the algorithm reports it (lines 8-9).If a conflict is detected, it is analyzed, which provides a conflict clause explaining the reason for the conflict (line 11).This clause is learnt (line 14), as it does not change the satisfiability of ϕ, and avoids encountering a conflict with the same causes in the future.The analysis is completed by the computation of a backjump point to which the algorithm backtracks (line 15).Finally, if no conflict appears, the algorithm chooses a new decision literal (line [18][19].The above steps are repeated until the satisfiability status of the formula is determined.
It is out of the scope of this paper to detail the existing variations for the conflict analysis and for the decision heuristic.

Symmetry-Guided Search
As explained earlier, the main problem of the static approaches is that they generate many sbp that are not effective in the solving (size of the generated formulas, overburden of the unit propagation procedure, etc.).
The idea we bring is to break symmetries on the fly: when the current partial assignment can not be a prefix of a lex-leader (of an orbit), an esbp (see Definition 3) that prunes this forbidden assignment and all its extensions is generated.
We implement this approach using two components that communicate with each other: the SAT-solving engine itself, and a symmetry controller.The symmetry controller is initially given a set of symmetries G4 .It observes the behavior of the SAT engine and updates its internal data according to the current assignment, to keep track of the status of the symmetries.This observation is incremental : whenever a literal is assigned or cancelled, the symmetry controller updates the status of all the symmetries.This corresponds to lines 5 and 16 of Algorithm 1.When the controller detects that the current assignment can not be a lex-leader (line 6), it generates the corresponding esbp (line 13).
In the remainder of this section, we detail the functions composing the symmetry controller.

Symmetries Status
Tracking.The updateAssign, updateCancel and isNotLexLeader functions (see Algorithm 2) track the status of symmetries based on Proposition 2; there, resides the core of our algorithm.
All these functions rely on the pt structure: a map of variables indexed by permutations.Initially, pt[g] = min(V g ) for all g ∈ G and all permutations are marked active.
For each permutation, g, the symmetry controller keeps track of the smallest variable pt[g] in the support of g such that pt[g] and g −1 (pt[g]) do not have the same value in the current assignment.If one of the two variables is not assigned, they are considered not to have the same value.
When new literals are assigned, only active symmetries need to have their pt[g] updated (line 2).This update is done thanks to a while loop (lines 4-5).
When literals are cancelled, we need to update the status of symmetries for which some variable v before pt[g], or g −1 (v), becomes unassigned (lines 9-10).Symmetries that were inactive may be reactivated (line 11).
The current assignment is not a lex-leader if some symmetry g is a reducer.This is detected by comparing the value of pt[g] with the value of g −1 (pt[g]) (line 16).The function isNotLexLeader also marks symmetries as inactive when appropriate (lines [18][19]. Generation of the esbp.When the current assignment cannot be a lexleader, some symmetry g is a reducer.The function generateEsbp computes the 1 function updateAssign(α: assignment) The functions keeping track of the status of the symmetries and generating the esbp.
η(α, g) defined in Definition 4, which is an effective symmetry-breaking predicate by Proposition 3.This will prevent the SAT engine to explore further the current partial assignment.

Lex-leader Forcing
Our algorithm prevents as much as possible the solver from visiting non lexleaders assignments.To do so, we propose an additional heuristic that delays the visit of non lex-leaders partial assignments.
A way to prevent α from becoming a non lex-leader is to force the literal g −1 (v) into α.This can be easily done by learning η(α , g) when the current assignment is α.The same reasoning holds when ¬g −1 (v) ∈ α and v ∈ α.

Implementation and Evaluation
In this section, we first highlight some details on our implementation of the symmetry controller.Then, we experimentally assess the performance of our algorithm against three other state-of-the-art tools.

cosy: An Efficient Implementation of the Symmetry Controller
We have implemented our method in a C++ library called cosy (1630 LoC).It implements a symmetry controller as described in the previous section, and can be interfaced with virtually any CDCL SAT solver.cosy is released under GPL v3 licence and is available at https://github.com/lip6/cosy.

Heuristics and Options.
Let us recall that finding the optimal ordering of variables (with respect to the exploitation of symmetries) is NP-hard [15], so the choice for this ordering is heuristic.cosy offers several possibilities to define this ordering: -a naive ordering, where variables are ordered by the lexicographic order of their names; -an ordering based on occurrences, where variables are sorted according to the number of times they occur in the input formula.The lexicographic order of variables names is used for those having the same number of occurrences; -an ordering based on symmetries, where variables belonging to the same orbit (under the given set of symmetries) are grouped together.Orbit are ordered by their numbers of occurrences.
The ordering of assignments we use in this paper orders negative literals before positive ones (thus, {¬v} < {v}), but using the converse ordering does not change the overall method.However, it can impact the performance of the solver on some instances, so that it is an option of the library.
All the symmetries we used for the presentation of our approach are permutations of variables.Our method straightforwardly extends to permutations of literals, also known as value permutations [4].Another option allows to activate the lex-leader forcing described in Sect.3.3.

Integration in
MiniSAT.We show how to integrate cosy to an existing solver, through example of MiniSAT [10].
First, we need an adapter that allows the communication between the solver and cosy (30 LoC).Then, we adapt Algorithm 1 to the different methods and functions of MiniSAT.In particular, the function updateAssign is moved into the uncheckEnqueue function of MiniSAT (2 LoC).The updateCancel function is moved to the cancelUntil function of MiniSAT that performs the backjumps (2 LoC).The isNotLexLeader and generateEsbp functions are integrated in the propagate function of MiniSAT (30 LoC).This is to keep track of the assignments as soon as they occur, then the esbp is produced as soon as an assignment is identified as not being lex-leader.Initialization issues are located in the main function of MiniSAT (15 LoC).
The integration of cosy increases MiniSAT code by 3%.

Evaluation
This section presents the evaluation of our approach.All experiments have been performed with our modified MiniSAT called MiniSym.The symmetries of the SAT problem instances have been computed by two different state-of-the-art tools saucy3 [13] and bliss [12].For a given group of symmetries, the first tool generates less permutations to represent the group than the second one, but it is slower than the other one.We selected from the last six editions of the SAT contests [11], the CNF instances for which bliss finds at least 2% of the variables are involved in some symmetries that could be computed in at most 1000 s of CPU time.We obtained a total of 1350 symmetric instances (discarding repetitions) out of 3700 instances in total.
All experiments have been conducted using the following conditions: each solver has been run once on each instance, with a time-out of 5000 s (including the execution time of the symmetries generation except for MiniSAT) and limited to 8 GB of memory.Experiments were executed on a computer with an Intel Xeon X7460 2.66 GHz featuring 24 cores and 128 GB of memory, running a Linux 4.4.13,along with g++ compiler version 6.3.
We compare MiniSym using the occurrence order, value symmetries, and without lex-leader forcing, against: -MiniSAT, as the reference solver without symmetry handling [10]; -Shatter, a symmetry breaking preprocessor described in [2], coupled with the MiniSAT SAT engine; -breakID, another symmetry breaking preprocessor, described in [8], also coupled with the MiniSAT SAT engine.
Each sat solution was successfully checked against the initial CNF.For situations, there is no way to provide an unsat certificate in presence of symmetries.Nevertheless, we checked our results were also computed by the other measured tools.Unfortunately, out of the 1350 benchmarked formulas, we have no proof or evidence for the 15 unsat formulas computed by MiniSym only.
Results are presented in Tables 1, 2, and 3. We report the number of instances solved within the time and memory limits for each solver and category.We separate the UNSAT instances (Table 1) from the SAT ones (Table 2).Besides the reference with no symmetry (column MiniSAT), we have compared the performance of the three tools when using symmetries computed by saucy3 (see Tables 1a and 2b), and bliss (see Tables 1a and 2b).Rows correspond to groups of instances: from each edition of the SAT contest, and when possible, we separated applicative instances (app x where x indicates the year) from hard combinatorial ones (hard x ).This separation was not possible for the editions 2015 and 2017 (all2015 and all2017).The total number of instances for each bench is indicated between parentheses.For each row, the cells corresponding to the tools solving the most instances (within time and memory limits) are typeset in bold and greyed out.Table 3 shows the cumulative and average PAR-2 times of the evaluated tools.We observe that MiniSym with saucy3 solves the most instances in only half of the unsat categories.However, with bliss, MiniSym solves the most instances in all but four of the unsat categories; it then also solves the highest number of instances among its competitors.This shows the interest of our approach for unsat instances.Since symmetries are used to reduce the search space, we were expecting that it will bring the most performance gain for unsat instances.
The situation for sat instances is more mitigated (Table 2), especially when using saucy3.Again, this is not very surprising: our method may cut the exploration of a satisfying assignment because it is not a lex-leader.This delays the discovery of a satisfying assignment.The other tools suffer less from such a delay, because they rely on symmetry breaking predicates generated in a pre-processing step.Also, when seeing the global results of MiniSAT, we can globally state that the use of symmetries in the case of satisfiable instances only offers a marginal improvement.
We observe that performances our tool are better with bliss than with saucy3 (see Fig. 1).We explain it as follows: saucy3 is known to compute fewer generators for the group of symmetries than bliss.Since, the larger the symmetries set is, the earlier the detection of an evidence that an assignment is not a lex-leader will be, we generate less symmetry-breaking predicates (only the effective ones).This is shown in Table 4; MiniSym generates an order of magnitude fewer predicates than breakID.
We also conducted experiments on highly symmetrical instances (all variables are involved in symmetries), whose results are presented in Table 5.The performance of breakID on this benchmark is explained by a specific optimization for the total symmetry groups that are found in these examples, that is neither implemented in Shatter nor in MiniSym.However, the difference between breakID and MiniSym is rather thin when using bliss.Our tool still outperforms Shatter on this benchmark.

Conclusion
This paper presented an approach dealing with the symmetries when they appear in SAT problems.It borrows from the state-of-the-art static-based approaches their basic principle, i.e., the adding of symmetry breaking predicates to the original problem, but performed in an incremental and dynamic way.This is possible thanks to the dynamic tracking of symmetries status and on-the-fly generation of effective symmetry breaking predicates.
Our approach outperforms other state-of-the-art static methods, as shown by an extensive evaluation on the symmetric problems gathered from the last six SAT competitions.
This approach is implemented in the C++ library called cosy.It is an off-theshelf component that can be interfaced with virtually any CDCL SAT solver.cosy is released under GPL licence and is available at https://github.com/lip6/cosy.
We now plan to focus on combining our approach with symmetry propagation [9].It seems that such a combination could be implemented thanks to minor changes on our algorithm.This would allow to integrate the acceleration mechanisms provided by the symmetry propagation, therefore obtaining a better pruning of the search three.
Another track for future work, is to evaluate the possibility of changing the order of variables dynamically: for example, following the order used by the solver when it chooses its decision variables.
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/),which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material.If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

Table 1 .
Comparison of different approaches on the unsat instances of the benchmarks of the six last editions of the SAT competition.

Table 2 .
Comparison of different approaches on the sat instances of the benchmarks of the six last editions of the SAT competition.

Table 3 .
Comparison of PAR-2 times (in seconds) of the benchmarks on the six last editions of the SAT competition.

Table 4 .
Comparison of the number of generated SBPs each time breakID and MiniSym both compute a verdict (number of verdicts between parentheses).

Table 5 .
Comparison of the tools on 99 highly symmetric unsat problems.