On Incremental Pre-processing for SMT

Abstract. We introduce a calculus for incremental pre-processing for SMT and instantiate it in the context of z3. It identifies when powerful formula simplifications can be retained when adding new constraints. Use cases that could not be solved in incremental mode can now be solved incrementally thanks to the availability of pre-processing. Our approach admits a class of transformations that preserve satisfiability, but not equivalence. We establish a taxonomy of pre-processing techniques that distinguishes cases where new constraints are modified from cases where previously added constraints have to be replayed. We then justify the soundness of the proposed incremental pre-processing calculus.


Introduction
Pre-processing is a central ingredient for scaling automated deduction. These techniques apply targeted global simplification steps that can drastically reduce the complexity of problems before search techniques, which use mainly local inference steps, are invoked. They are used across several solver domains, spanning SAT, SMT, first-order automated theorem proving, constraint programming, and integer programming. With the exception of SAT solvers, prior techniques do not combine well when new constraints are added incrementally to a pre-processed state. Solvers have the option to restart pre-processing from scratch. This model is viable if the overall number of solver calls is small compared to the time spent solving, but is not practical for scenarios where many minor variations of a set of main constraints are queried. Such scenarios may be found in applications of dynamic symbolic execution or symbolic model checking.
A procedure to incorporate pre- and in-processing techniques [27] into incremental SAT solvers was introduced in [18], where such incremental in-processing allowed a dramatic improvement in the performance of bounded model checking applications. In the case of SAT, the effect of a simplification step is recorded in a reconstruction stack. Each eliminated clause is saved on that stack together with a partial assignment, called its witness, that is used to show the redundancy of the eliminated clause. For example, the redundancy of a blocked clause is witnessed by its blocked literal, a literal upon which all resolvents are tautological [26,32]. The reconstruction stack has two very important roles in SAT solvers. First of all, it has all the information that is necessary for model reconstruction [25]. When the elimination of a clause is not model-preserving, its witness on the stack tells how to modify or extend any found solution of the simplified formula such that it then satisfies the removed clause as well. Beyond that, the reconstruction stack allows recognizing all those previous simplification steps that are potentially invalidated by an incrementally added new constraint. For example, literals that were blocked in the global state of the previous clauses might not be blocked any more in the presence of some new constraints. Finding these clauses and their cone of influence on the reconstruction stack allows undoing only the problematic previous simplification steps, thereby allowing pre- and in-processing to be incremental [18].
Motivated by incremental in-processing SAT solvers, our goal here is to pave a path towards a similar mechanism in the context of SMT solvers. However, SMT problems extend propositional SAT formulas in several dimensions: the base theory of SMT is the theory of equality over uninterpreted functions and predicates, SMT formulas may contain quantifiers, and constants and functions may have interpretations over theories. Concrete cases of incremental SMT pre-processing were considered in [19]. While most of the formula simplification techniques of SAT solvers are captured by well-studied redundancy properties [23], such a unified understanding and description of SMT pre-processing techniques has not yet been introduced. Though some redundancy notions of SAT solvers can be directly embedded or generalized to SMT [30], a notion that appears to capture simplifications in SMT in many cases is that of a substitution: an uninterpreted constant or function is defined into a solved form and the constraints are simplified based on the solution. When new constraints containing the solved function symbols are added after pre-processing, our method distinguishes between simplifications that allow applying the substitution to the new formula and those that require removing the substitution and re-adding the old constraints that were simplified. We have found it useful to characterize pre-processing simplifications by the following categories.
Equivalence Preserving Simplifications Many simplification methods are based on equivalence preserving simplifications. For example, x > x − y + 1 simplifies to y > 1. They are automatically incremental by virtue of not changing the set of models. Developing equivalence preserving simplifications is a significant area of research and engineering by itself. A good example is using and-inverter graphs (AIGs) for simplifying propositional and first-order formulas [45,24]. The main challenge with developing equivalence preserving simplifications in an incremental setting is to make them efficient. The equality x → y + 1 is used in a model converter to establish the original model. Some pre-processing techniques translate constraints from one domain to another. For example, formulas over bounded integers can be solved by translation into bit-vectors. This translation can be described with a set of equalities where bounded integers are solved for their bit-vector representation (see later an example in Table 1).
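An equivalence-preserving rewrite like the one above can be sketched over a toy representation of linear inequalities as coefficient maps. This is an illustration only, not Z3's actual rewriter: moving every term to the left-hand side and cancelling coefficients leaves the set of models unchanged.

```python
from collections import Counter

def normalize_gt(lhs, rhs):
    """Equivalence-preserving rewrite of a linear inequality lhs > rhs.

    Terms are coefficient maps, e.g. {'x': 1} for x and
    {'x': 1, 'y': -1, '1': 1} for x - y + 1.  The rewrite returns the
    nonzero coefficients of lhs - rhs, read as (lhs - rhs) > 0."""
    diff = Counter(lhs)
    diff.subtract(rhs)
    return {v: c for v, c in diff.items() if c != 0}

# x > x - y + 1 rewrites to y - 1 > 0, i.e. y > 1
print(normalize_gt({'x': 1}, {'x': 1, 'y': -1, '1': 1}))
```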

Under Constrained Simplifications
The rigid constrained simplifications already cover a significant class of pre-processing methods. Allowing incrementally solving for variables has a profound practical effect on using z3 incrementally in user scenarios. There is, however, a larger class of simplifications that also allow eliminating variables but do not preserve solutions to the eliminated variable. These simplifications have the same or more solutions for symbols in the original formula and we call them under-constrained. For example, the formula ((x ≃ y ∧ y < z + u) ∨ y ≥ z · u) contains x in only one position. It can be replaced by the formula ((b ∧ y < z + u) ∨ y ≥ z · u) where b is fresh. Similarly, introducing definitions of fresh symbols does not eliminate solutions to symbols in the original formula. Lastly, when removing redundant clauses, the new formula may have more solutions. The Tseitin transformation introduces definitions that allow removing redundant, non-CNF, formulas.
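A minimal sketch of such an under-constrained replacement step follows. The atom representation and helper names are illustrative, not Z3's: a formula is a list of atoms, each carrying the set of symbols it mentions, and an atom containing the only occurrence of a symbol is replaced by a fresh Boolean.

```python
from collections import Counter

def occurrences(atoms):
    """Count symbol occurrences across the atoms of a formula."""
    return Counter(sym for a in atoms for sym in a["symbols"])

def under_constrain(atoms, var, fresh="b"):
    """If `var` occurs in exactly one atom, that atom may be replaced by a
    fresh Boolean without removing solutions for the remaining symbols."""
    if occurrences(atoms)[var] != 1:
        return atoms  # not applicable: var is shared between atoms
    return [{"text": fresh, "symbols": {fresh}} if var in a["symbols"] else a
            for a in atoms]

# ((x = y ∧ y < z + u) ∨ y ≥ z·u): x occurs in a single atom
phi = [{"text": "x = y", "symbols": {"x", "y"}},
       {"text": "y < z + u", "symbols": {"y", "z", "u"}},
       {"text": "y >= z * u", "symbols": {"y", "z", "u"}}]
print([a["text"] for a in under_constrain(phi, "x")])
```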
Over Constrained Simplifications Symmetry reduction [38,14] and strengthening using propagation redundancy criteria [37] are prominent examples of simplifications that apply strengthening to reduce the search space. These transformations are not covered by the classes handled by our main result. We leave it to future work to examine whether or how to incorporate strengthening: one avenue is to leverage assumption literals [16] to temporarily enable strengthenings either as part of pre-processing or during search [39].
Table 1 summarizes the main categories of pre-processing techniques discussed so far. This paper develops a calculus of incremental pre-processing for rigid constrained and under-constrained simplifications, clause elimination, and introduction of definitions. However, it does not discuss over-constrained simplifications further.
In this paper we introduce the concept of simplification modulo substitutions and show that the main SMT pre-processing methods maintain such a property. Based on that, we show how to apply or revert the effect of previous pre-processing steps when new formulas are added after simplification.

Preliminaries
We assume the usual notions of first-order logic with equality, satisfiability, logical consequence and theory, as described e.g. in [17]. An interpretation M for a signature Σ (or Σ-model) consists of a non-empty set U_M called the universe of the model, and a mapping ( )^M assigning to each variable and constant symbol an element of U_M, to each n-ary function symbol f in Σ an n-ary function f^M from U_M^n to U_M, and to each n-ary predicate symbol p in Σ an n-ary function from U_M^n to distinguished values representing true and false. Note that to keep the presentation simple, we only consider a single universe in the models. Interpretations extend to terms by composition.
Table 1. Main categories of pre-processing techniques found in SMT solvers. Function ite is an abbreviation for if-then-else and bv2int is a function that maps a bit-vector to an integer value.
We use the terminology symbols to refer to uninterpreted symbols (variables) and function symbols. Given a model M and a symbol x, the model M[x → a] is exactly the same as M, except that x^{M[x → a]} = a, where a ∈ U_M for 0-ary symbols and a is a function over U_M for n-ary function or predicate symbols.
Lemma 1 (Translation Lemma [41]). If F is a formula and t is a term s.t. no variable in t occurs bound in F, then M |= F[t/x] if and only if M[x → t^M] |= F.
Note that we may use λ-terms to represent updates to function and predicate symbols. The interpretation of a λ-term is a function.
We reserve Skolem symbols for n-ary functions (where n = 0 is possible); they cannot occur in input formulas. Only pre-processing methods may introduce Skolem symbols, which guarantees that they are fresh.
Convention 1 (Variable non-capture) Throughout this paper we assume that free and bound variables are disjoint, such that when we substitute a term t for a variable x in formula F, none of the variables in t are captured.
Definition 1 (Labeled substitution). ⟨x ← t; Ψ⟩_B represents a substitution of x by t, justified by the formula Ψ. The label B is either ⊤ or ⊥ and it indicates whether the map x → t may be used as an equal replacement of Ψ.
Example 1. The labeled substitution ⟨x ← y + 1; x ≃ y + 1⟩_⊥ represents the substitution of x by y + 1, justified by the formula x ≃ y + 1. The label ⊥ of the substitution indicates that applying the substitution to a formula F where x ≃ y + 1 is present does not change the set of models of the formula.
Given a sequence of labeled substitutions θ and an interpretation M, we define the interpretation Mθ by Mε = M and M(θ′⟨x ← t; Ψ⟩_B) = (M[x → t^M])θ′. Given a formula F, we define the formula Fθ by Fε = F and F(⟨x ← t; Ψ⟩_B θ′) = (F[t/x])θ′. Informally, a sequence of substitutions θ is applied to interpretations from right to left (i.e. backwards), while to formulas from left to right (i.e. forwards). Further, note that the translation lemma generalizes in a straightforward way to substitutions.
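The forward/backward reading can be illustrated with a small sketch. Substitutions are (x, t) pairs over Python expression strings; labels and justifying formulas are omitted, and the string-based term handling is purely illustrative.

```python
def apply_to_formula(formula, subs):
    """Apply a substitution sequence to a formula from left to right."""
    for x, t in subs:
        formula = formula.replace(x, t)
    return formula

def apply_to_model(model, subs):
    """Apply a substitution sequence to a model from right to left:
    each step reinterprets x as the value of t in the current model."""
    for x, t in reversed(subs):
        model = {**model, x: eval(t, {}, dict(model))}
    return model

theta = [("x", "y + 1"), ("y", "2 * z")]
print(apply_to_formula("x > 0", theta))   # forward: x is replaced, then y
print(apply_to_model({"z": 3}, theta))    # backward: y is recovered, then x
```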

Incremental Pre-processing
In this section we introduce a calculus to describe incremental pre-processing for SMT based on the following notion.
Definition 4 (Simplification modulo θ). We say that the formula F simplifies to F′ modulo θ, written F ⪰θ F′, if for every model M′ of F′ the interpretation M′θ is a model of F, and for every model M of F there is a model M′ of F′ such that M′θ agrees with M on all symbols that are in F or in background theories or not in F′.
It follows that simplification allows transitive chaining, assuming that symbols are not recycled.
Lemma 2 (Transitivity of simplification). Let F ⪰θ F′ and F′ ⪰θ′ F′′ such that every symbol that is both in F and F′′ also occurs in F′ (i.e. old symbols are not re-introduced). Then F ⪰θθ′ F′′.

Simplification rules
There are several possible situations where the concept of simplification modulo substitutions can be used to capture potential simplification steps. For example, a useful special case for simplification modulo θ is when a formula F implies an equality x ≃ t that can then be turned into a substitution to simplify F.
Example 2. The formula isCons(x) ∧ F[x] implies ∃h, t . x ≃ cons(h, t), where h, t are fresh variables (corresponding to the head and tail of a cons list). We may substitute x by cons(h, t) in F[x] to eliminate x. The literal isCons(cons(h, t)) is equivalent to true, and F[cons(h, t)] is a model simplification of the original formula modulo x ≃ cons(h, t).
There are also useful special cases where a formula F does not imply an equality x ≃ t, but the same equality may still be used to simplify F.
Example 3. In the formula F := ((x ≃ 3 ∧ x > u) ∨ y > u) ∧ u > z we can substitute x → 3 and retain simplification. The formula F simplifies to ((3 > u) ∨ y > u) ∧ u > z.
There are also cases where substitutions are not suitable to describe the relation between F and F′. It is easier to characterize these by the property that F′ is a proper subset of F.
Example 4. A blocked clause p ∨ C can be removed from a set of formulas without changing satisfiability: F, (p ∨ C) ⪰_{p → p∨¬C} F. If we were to substitute p by p ∨ ¬C everywhere in F it would weaken clauses where p occurs positively.
Finally, it is possible to accommodate cases where pre-processing introduces definitions, such as through the unfold transformation (see Section 6.5), or by Skolemization and Tseitin transformations.
Example 5. The Skolemization of ∀x . ∃y . p(x, y) is ∀x . p(x, f_sk(x)). Here the original quantified formula is replaced by the Skolemized formula.
We model the pre-processing performed by an SMT solver as a sequence of abstract states, where each state consists of two components: a formula F and an ordered sequence of labeled substitutions θ. Based on the cases shown, we formulate the following conditions for applying simplification rules in Figure 1.
Fig. 1. A calculus for pre-processing in SMT
We formulated the side conditions so as to identify a minimal set of conjuncts Ψ of F involved with the solution for x. Note that a simplification remains valid when adding conjuncts that do not contain x. The Update rule broadly handles a set of simplifications, including proof rules from DRAT systems and the introduction of definitions and Skolemization. It may be presented in forms where Φ or Ψ or the substitution are empty. The substitution x → t generally represents a tuple of symbols x replaced by terms t. To simplify the presentation we only discuss the case where x is a single symbol, and we elide rules that preserve equivalence. The Update rule records Ψ so it can later be re-added in case a new constraint mentions x. This may be overkill when Φ[t/y] = Ψ for y fresh (in Section 4 we will show another rule, Invert, that adds only the equality y ≃ t in such cases).
Lemma 3 establishes that the side-condition for Rigid ensures simplification modulo θ. We therefore have the following corollaries.
Corollary 2. If a formula F′ is derived from F by the inferences from Figure 1, then it has the property F ⪰_{x → t} F′.
The other rules enforce preservation of satisfiability in their side-conditions.
Corollary 3. The rules from Figure 1 preserve satisfiability.
The transitive application of the simplifications also preserves satisfiability, in a way that extends the notion of simplification modulo a substitution.
Proposition 1. Consider a formula F_0 and a state F ∥ θ derived from F_0 ∥ ε using the rules from Figure 1. Then F_0 ⪰θ F.
Proof. It follows since Corollary 2 notes that each application of a rule from Figure 1 is a simplification modulo a substitution, and Lemma 2 notes that simplification modulo is transitive.
Informally, Proposition 1 means that using θ, one can transform any model of the simplified formula into a model of the original input formula. Note that the simplified F may contain fresh Skolem symbols that do not occur in F_0.

Pre-processing Replay
The rules of Fig. 1 capture possible pre-processing steps that can be applied on a single SMT problem. We now describe the scenario where we add additional constraints Φ to a pre-processed state. Without incremental pre-processing we have the option to conjoin Φ to the original formula F_0 and re-run pre-processing.
The goal of incremental pre-processing is to retain as much of the effect of previous work as possible.
We will show that for pre-processing steps derived by rule Rigid it is possible to apply the corresponding substitution to Φ directly, while the other simplification steps may require re-introducing formulas that were previously removed. We call this process of applying the effect of simplifications on a new formula pre-processing replay. Figure 2 shows an imperative implementation of pre-processing replay.
Fig. 2. Replay(formula Φ, substitution sequence θ = σ1, . . . , σn)
Our main proposition summarizes the main property of Replay and ensures that an arbitrary formula Φ can be added mid-stream after pre-processing.
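The shape of Replay can be sketched as follows. This is a minimal model, with token-list formulas and quadruples for labeled substitutions, and it ignores dependencies between undone steps that a full implementation must track:

```python
def replay(phi, theta):
    """phi: the new formula as a list of symbol tokens.
    theta: list of (x, t_tokens, psi, rigid) labeled substitutions.
    Rigid substitutions touching phi are applied to it directly; other
    substitutions touching phi are undone, and their removed formula psi
    is scheduled for re-addition."""
    surviving, readded = [], []
    for x, t, psi, rigid in theta:
        if x not in phi:
            surviving.append((x, t, psi, rigid))  # phi untouched by this step
        elif rigid:
            # apply x <- t to the new formula
            phi = [s for tok in phi for s in (t if tok == x else [tok])]
            surviving.append((x, t, psi, rigid))
        else:
            readded.append(psi)  # undo: re-add the removed formula
    return phi, surviving, readded

theta = [("x", ["y", "+", "1"], "x = y + 1", True),
         ("p", ["b"], "p ∨ C", False)]
phi, surviving, readded = replay(["x", ">", "p"], theta)
print(phi, readded)
```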
Proposition 2. Let F ∥ θ be a state resulting from pre-processing F_0, and let F ∧ Φ′ ∥ θ′ be a state produced by applying procedure Replay to Φ and θ. Then F_0 ∧ Φ ⪰θ′ F ∧ Φ′.
To establish Proposition 2 we will introduce a calculus for reverting the effect of simplifications. It is shown in Figure 3 and comprises two rules: one for adding a formula with a substitution to F, the other both reverts the effect of a simplification and adds the reverted formula to F. The inferences rely on a side-condition that the formulas Φ, Ψ are clean relative to the substitution θ.
Thus, intuitively, Φ is clean w.r.t. θ if Φθ uses only Rigid substitutions from θ. We now establish that formulas that are clean relative to θ can be added (after substitution) to formulas while maintaining models. The substitution used in rigid updates corresponds to equalities that are consequences.
Lemma 5. Given a state F′ ∥ θθ′ derived from the state F ∥ θ and a formula Φ that is clean with respect to θ′, then F ∧ Φ ⪰θ′ F′ ∧ Φθ′.
Proof. We examine the two directions.
- Let M |= F ∧ Φ. Induction on the length of the derivation from F to F′ establishes that if M |= F, then there is a corresponding model M′ of F′. Each equality used in a rigid substitution can be added to the result, so the resulting model M′ can be constrained to satisfy all equalities used in rigid substitutions. Since M′ |= Φ already, we obtain M′ |= F′ ∧ Φθ′.
- Let M′ |= F′ ∧ Φθ′. Then from the assumption of simplification modulo θ′, we get M′θ′ |= F ∧ Φ.
The correctness of the Add rule is now immediate:
Corollary 4. Let F ∥ θ be derived from F_0 ∥ ε, and Φ clean with respect to θ. Then F_0 ∧ Φ simplifies modulo θ to F ∧ Φθ.
Proof. It follows from Lemma 5.
With Proposition 1 we established that Rigid, Flex and Update maintain F_0 ⪰θ F. We need to show the same for rule Undo. The first step is to establish that the formula removed by each of the pre-processing rules can be re-added without affecting simplification.
Lemma 6. Given an inference F ∥ θ =⇒ F′ ∥ θ⟨x ← t; Ψ⟩_B by either of the rules Rigid, Update, Flex, the formula F simplifies to F′, Ψ modulo ε.
Proof. The proof is by case analysis on the rule that is applied.
It is worth examining why the side-conditions for simplification modulo are used. As the following example shows, transformations that only preserve satisfiability but strengthen formulas cannot be used easily in an incremental setting.
Example 6. Let F_0 be the satisfiable formula x ≃ y ∧ y ≤ z ∧ z ≃ v. In that formula x, y are equal, and z, v are equal. Let us assume that we simplify via the solution where the classes are merged (i.e. where y ≃ z). It is satisfiability preserving. It suggests a transformation that we call Flex†.
The resulting state is still satisfiable. Now Undo can be applied without any problems. The result is still satisfiable, but not equivalent to F_0 (it does not have the models where the two equivalence classes are not merged).
Adding the constraint y ≃ z − 1 to F_0 would be satisfiable, but adding it to our formula is unsatisfiable.

Simplification Methods
Many simplification methods used in practice during pre-processing are equivalence preserving. These methods include formula rewriting, constant propagation, NNF conversion, quantifier elimination, and bit-blasting. They do not require the methodology from this paper and have been integral to Z3 since its inception. We discuss here the main simplification pre-processing routines that do not preserve equivalence and how they relate to our taxonomy.

Equality Solving
One of the most useful pre-processing techniques eliminates symbols when they can be solved, that is, a constraint implies an equality x ≃ t, where t is a term that does not contain x. Equality solving corresponds to finding unitary solutions to unification problems modulo theories. Most uses of equality solving are captured by transformations justified by rule Rigid. In Z3, equality solving comprises a two-stage process:
1. Extract a set of solution candidates E implied by the current formula φ.
2. Extract from E a subset of solutions that can be oriented without introducing cyclic dependencies.
To elaborate, let E be a set of solution candidates x_1 = t_1, . . . , x_n = t_n. The candidates may contain multiple equalities using the same symbol. For example, E could be x = f(x), x = g(y), y = h(z). We cannot use the solution x = f(x) because x already occurs in f(x). But we can use the solutions x = g(y), y = h(z), processed in this order: first x is replaced by g(y), then y is replaced by h(z). In the second stage we extract from E a subset of equalities x_{i_1} = t_{i_1}, . . . , x_{i_k} = t_{i_k}, where the x_{i_j} are distinct and the t_{i_j} are terms such that x_{i_j} ∉ t_{i_{j′}} for j ≤ j′. The subset is in triangular form.
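The second stage admits a simple greedy sketch. The term representation here is illustrative (a function name plus the symbols it mentions); a solver's orientation logic additionally weighs which of several candidates to prefer:

```python
def vars_of(term, solution):
    """Free symbols of a term after expanding already-solved symbols."""
    fn, args = term
    out = set()
    for a in args:
        if a in solution:
            out |= vars_of(solution[a], solution)
        else:
            out.add(a)
    return out

def triangular_subset(candidates):
    """Greedily pick candidates x = t that introduce no cyclic dependency.

    candidates: list of (x, term) with term = (function_name, [symbols]).
    Returns a dict x -> term, in triangular (acyclic) order."""
    solution = {}
    for x, t in candidates:
        if x in solution:
            continue  # keep at most one solution per symbol
        if x in vars_of(t, solution):
            continue  # would introduce a cyclic dependency
        solution[x] = t
    return solution

# The example above: E = { x = f(x), x = g(y), y = h(z) }
E = [("x", ("f", ["x"])), ("x", ("g", ["y"])), ("y", ("h", ["z"]))]
print(triangular_subset(E))  # drops x = f(x), keeps the triangular pair
```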
Example 7. We illustrate two applications of Rigid for eliminating two symbols from three equations. The choice of the first two equations is arbitrary. An alternative simplification could choose to eliminate x and z instead. It is not possible, however, to eliminate all three variables.
The set of unification modulo theories facilities used in Z3 is based on extracting simple definitions. Foremost, for a conjunct x ≃ t of φ, where x is uninterpreted and x ̸= t, include the equality candidate x ≃ t. Other equality candidates are included from formulas of the form ite(c, x ≃ t, x ≃ s) and arithmetic equalities of the form x + s ≃ t, such that x ≃ t − s is a solution candidate for x. Note that solution candidates are not necessarily unique for an equality. The constraint x + y ≃ t can be used as a solution to both x and y. If x has a nested occurrence within t, the solution for y, but not x, can be used. Equality solving interacts with simplification pre-processing: equalities over algebraic data-types can be assumed to be in decomposed form already, since rewriting simplification decomposes equalities between constructor terms, such as cons(h_1, t_1) ≃ cons(h_2, t_2), into equalities over their arguments.
Equality solving can be extended modulo theories in several directions. Arithmetical equalities can be extracted from solving Diophantine equations and from polynomial equality factorization as part of establishing a Gröbner basis. Equalities can be extracted from inequalities [6,31]. Other theories, such as the theory of arrays, allow extracting solutions from equalities store(a, i, v) ≃ t, where a is a symbol that does not occur in t, i, v, as a ≃ store(t, i, w), together with the constraint select(t, i) ≃ v, where w is fresh. We leave a study of the costs and benefits of these approaches within the context of incremental pre-processing to future work.
Equality solving is extended to sub-formulas in the following way: when a positive sub-formula implies an equality x ≃ t and the symbol x does not occur outside of the sub-formula, then x can be replaced by t within the sub-formula. The solution is no longer rigid constrained but can be justified by Flex.
Example 8. Suppose x ̸ ∈ F, Ψ , then we can use Flex to justify the simplification

Unconstrained sub-terms
Symbols that have a single occurrence in a formula may be solved for based on context. For example, with the formula x ≤ y, y < z, z ≤ u, p(u), q(u), the constant x can be eliminated by using the solution x ≃ y. Then y can be eliminated by setting y ≃ z − 1, and finally z ≃ u.
Invertibility of unconstrained symbols (see e.g. [8,7]) in an incremental setting for bit-vectors was introduced in [19]. The method implements the following proof rule, exemplified for the term x + t, containing the only occurrence of x.
To justify rule Invert in our setting, it suffices to check the condition from Lemma 6. Alternatively, we can use the generic rule Update when applying unconstrained simplifications. The rule Invert is more efficient than using Update because the latter requires adding back an entire conjunction Ψ where the invertible term x + t occurs. Invertibility can also be used to justify the elimination of nested definitions, where a nested sub-formula of F is replaced by a fresh Boolean symbol y.
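The model reconstruction behind Invert on a term x + t can be checked numerically. Here F[·] := (· > u) is an illustrative context and the concrete values are arbitrary; this is not the rule's general statement:

```python
def F(term, u):
    """An illustrative context F[.] := (. > u)."""
    return term > u

t_val, u_val = 7, 3
# The simplified formula F[y] speaks about a fresh y; pick any model of it.
y_val = 10
assert F(y_val, u_val)
# Reconstruct x so that x + t evaluates to y, i.e. x := y - t.
x_val = y_val - t_val
assert F(x_val + t_val, u_val)  # the original F[x + t] is satisfied
print("model reconstructed: x =", x_val)
```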
Invertibility conditions are theory dependent. Figure 4 exemplifies the main invertibility conditions for arithmetic.
Z3 uses a heap ordered by occurrence counts to identify candidates for invertibility. It first processes all symbols with occurrence count 1. If it is possible to eliminate a symbol with occurrence count 1, the occurrence counts of sub-terms under the term that gets eliminated are decreased. The elimination process stops once the heap only contains symbols with occurrence counts above 1.
Fig. 4. Invertibility rules for symbols x, x′ that occur uniquely in F; y is fresh.
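This occurrence-count heap can be sketched with heapq and lazy deletion. We assume, purely for illustration, that eliminating a symbol decrements the counts of the symbols under its term; this is not Z3's actual data layout:

```python
import heapq

def eliminate(counts, subterms):
    """counts: symbol -> occurrence count.
    subterms: symbol -> symbols occurring under the term that disappears
    when that symbol is eliminated."""
    heap = [(c, s) for s, c in counts.items()]
    heapq.heapify(heap)
    eliminated = []
    while heap:
        c, s = heapq.heappop(heap)
        if c != counts[s]:
            continue              # stale heap entry (lazy deletion)
        if c > 1:
            break                 # only shared symbols remain; stop
        eliminated.append(s)
        counts[s] = 0
        for sub in subterms.get(s, []):
            counts[sub] -= 1
            heapq.heappush(heap, (counts[sub], sub))
    return eliminated

# x <= y, y < z, z <= u, p(u), q(u): counts x:1, y:2, z:2, u:3
print(eliminate({"x": 1, "y": 2, "z": 2, "u": 3},
                {"x": ["y"], "y": ["z"], "z": ["u"]}))
```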

Symbol Elimination and Macros
SAT solvers use symbol elimination [15] to simplify clauses. The first-order version [11] remains timely in more recent works as well [28]. A predicate p can be eliminated if it occurs at most once in every clause, either positively or negatively. Clauses that contain p are replaced by resolvents, obtained by applying binary resolution exhaustively, and then the clauses containing p are removed.
Example 9. We illustrate symbol elimination for the ground case with two clauses, and F such that p ̸∈ F, as an instance of the Update rule.
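For the ground case, the exhaustive resolution step can be sketched as follows. Literals are (name, polarity) pairs; the encoding is illustrative and assumes p occurs at most once per clause:

```python
def eliminate_predicate(clauses, p):
    """Replace all clauses mentioning p by their non-tautological
    resolvents on p, then drop the clauses containing p."""
    pos = [c for c in clauses if (p, True) in c]
    neg = [c for c in clauses if (p, False) in c]
    rest = [c for c in clauses if c not in pos and c not in neg]
    resolvents = []
    for c in pos:
        for d in neg:
            r = (c - {(p, True)}) | (d - {(p, False)})
            # keep only non-tautological resolvents (no q and not-q)
            if not any((name, not pol) in r for name, pol in r):
                resolvents.append(r)
    return rest + resolvents

# p ∨ q and ¬p ∨ r resolve to q ∨ r; s survives untouched
clauses = [frozenset({("p", True), ("q", True)}),
           frozenset({("p", False), ("r", True)}),
           frozenset({("s", True)})]
print(eliminate_predicate(clauses, "p"))
```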
The same elimination technique can also be applied to Horn clauses where p does not occur both in the head and body of any rule. A solution for the eliminated predicate is a conjunction of the upper bounds for p or a disjunction of lower bounds for p. It is generally a quantified formula. If the involved clauses admit quantifier-free interpolants, the solution can also be computed using an interpolant from a solution to the reduced system [4]. Thus, the term t in a substitution x → t may only be computed after an initial model is known.
There are many cases where symbols can be eliminated incrementally and justified by the Rigid rule: when the formula contains a conditional macro definition for f, guarded by ψ, one can replace f(a) by ite(ψ, t, f′(a)), where f′ is fresh, and add the clause ∀x . f′(x) ̸≃ t.
Macro elimination can be extended to ordered structures and to combinations of theories [42]. It has been integral to making quantified reasoning with bit-vectors [44] practical. We claim that first-order in-processing rules based on blocked clauses, asymmetric tautology elimination, and covered clauses, known from SAT [29], can also be captured by Update. We substantiate the claim with an example, but leave a comprehensive treatment for future work:
Example 10. Consider the clause C := p(x) ∨ q(x) and F := ¬p(x) ∨ p(f(x)) ∨ r(x), ¬p(x) ∨ p(f(x)) ∨ p(g(x)). The variable x is universally quantified. Then C can be rewritten to p(x) ∨ q(x) ∨ p(f(x)) without affecting satisfiability. The covered literal p(f(x)) was added to C as it occurs in every resolvent with p(x). The model for p has to be fixed, however. The model update is a first-order lifting of the propositional case.

Implementation
We have implemented incremental pre-processing as an integral component of a new SMT solver, part of Z3. It can be enabled by setting the option sat.smt=true from the command line. It includes simplification by equality solving, elimination of uninterpreted sub-terms, and macro detection, as described in Section 4. The primary reason for supporting incremental pre-processing has been usability. GitHub issues pointing to performance cliffs when switching to incremental mode are recurrent. A distilled example where pre-processing can solve formulas is as follows:
Example 11. Consider the benchmark.
Simplifiers interoperate with user scopes: SMT solvers support scoping using the operations push and pop. All assertions made within a push are invalidated by a matching pop. To allow simplifiers to interoperate with recursive function definitions, they track symbols used in the bodies of recursive functions as frozen. Those symbols are excluded from solving. Similar to CaDiCaL's implementation for replaying clauses (see [18]), our implementation of Replay stores the domain of θ in a hash-set to bypass processing formulas that have no symbols in θ.
6.1 Pre- and in-processing for SAT and QBF
Pre-processing for SAT has received significant attention with the milestone work in SatELite [15] and then using notions of blocked clauses [27] and solution reconstruction [25]. Pre-processing techniques for QBF are discussed for example in [3,22]. The main pre-processing methods for propositional satisfiability solvers can be captured using our rule Update (see Example 4 for an instance of blocked clause elimination simplification). For the case where ¬p ∨ D is a blocked clause, the model update is the de Morgan dual: removing ¬p ∨ D triggers the update p → p ∧ D. The work [18] introduces an inference system that also addresses redundant clauses and represents model updates using a notion of witness labeled clauses. The semantic content of the rules used for SAT is captured by Update. However, we elided tracking redundant clauses in this work. The case for SMT motivates the specialized rules Rigid, Flex and Invert.
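The witness-based solution reconstruction for a removed blocked clause can be sketched propositionally. This illustrates the update p → p ∨ ¬C, not CaDiCaL's actual data structures:

```python
def reconstruct(model, clause, blocked_lit):
    """model: dict var -> bool; clause: set of (var, polarity) literals.
    If the found model falsifies the removed blocked clause, flip the
    blocked literal.  Flipping it cannot falsify other clauses, because
    every resolvent on the blocked literal is tautological."""
    if not any(model[v] == pol for v, pol in clause):
        v, pol = blocked_lit
        model = {**model, v: pol}
    return model

# removed blocked clause p ∨ q with blocked literal p
m = reconstruct({"p": False, "q": False}, {("p", True), ("q", True)}, ("p", True))
print(m)  # p flipped so the removed clause is satisfied
```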

Pre-processing for SMT
Pre-processing simplification is integral in all main SMT solvers, including [33,2]. Incremental pre-processing with special attention to bit-vectors was introduced in [19]. The transformations considered there can be represented by the Rigid and Invert rules. Z3 exposes pre-processing simplifications as tactics [13] and allows users to compose them to suit the specific needs of applications.
Invertibility conditions are used in [34] to guide local search. This work also considers a candidate value of all symbols. For example, F[x · t] is invertible to F[y] if t evaluates to 1.

Pre-processing for MIP
Pre-solving is the terminology for pre-processing in mixed-integer linear programming solvers. There is a significant repertoire of pre-solving methods integrated in leading MIP solvers. Their effects are well documented in the newer survey [1], which provides an updated perspective to [20]. Pre-solving was developed earlier in [40]. The main methods can be categorized as operating on single rows (single constraints) or single columns (single variables), multiple rows, and multiple columns, and as using global information about the tableau. They also include methods known from other domains, such as literal probing, also found in SAT solvers, and symmetry reduction for sparse systems [38]. We are not aware of under-constrained simplifications used in mainstream MIP solvers. Only symmetry reduction stands out as outside the scope of incremental pre-solve methods.
Example 12. Pre-processing that combines two rows or combines two columns relies on efficient indexing [21] to be effective. The two-column non-zero cancellation method considers the situation where the coefficients of two variables maintain a high degree of correlation. Consider the following formula 2x + 4y + z ≤ 5 ∧ x + 2y + u ≤ 6 ∧ 3x + y + z ≤ 3 ∧ φ where x, y ̸∈ φ.
The coefficients of x, y in the first two inequalities are related by the affine relation given by λ = 2. In this case the system can be reformulated, justified by rule Rigid, by introducing a fresh variable v and using the inequalities 2v + z ≤ 5 ∧ v + u ≤ 6 ∧ 3v − 5y + z ≤ 3 ∧ φ.
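The reformulation can be validated numerically: with the fresh variable v standing for x + 2y, the third row becomes 3x + y = 3(v − 2y) + y = 3v − 5y. A brute-force check over sample integer points (an illustration, not a proof):

```python
import itertools

def original(x, y, z, u):
    return 2*x + 4*y + z <= 5 and x + 2*y + u <= 6 and 3*x + y + z <= 3

def reformulated(v, y, z, u):
    return 2*v + z <= 5 and v + u <= 6 and 3*v - 5*y + z <= 3

# Setting v = x + 2y makes the two systems agree on every sampled point.
for x, y, z, u in itertools.product(range(-3, 4), repeat=4):
    assert original(x, y, z, u) == reformulated(x + 2*y, y, z, u)
print("systems agree on all sampled points")
```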
6.4 Pre-processing in first- and higher-order provers
Pre-processing is also an important part of first-order theorem provers. Techniques for creating small clausal normal forms have long attracted attention [35]. The main simplifications [24] are based on detecting definitions similar to what is described in Section 4.3, but with the extra twist of ensuring that simplifications preserve first-order decidability, such as ensuring that formulas remain within the EPR fragment. Furthermore, a variant of AIGs with nodes representing quantifiers is used to detect shared structure. While [24] is only concerned with establishing preservation of satisfiability, we note that the classification as model equivalent from Section 4.3 extends to the cases considered. In-processing inspired by SAT was pursued for first-order [29,43] and recently for higher-order settings [5].

Constrained Horn Clauses
Constrained Horn Clauses [4] enjoy a tight connection with Logic Programming, where several transformation techniques were developed [10,12], including incremental consequence propagation [36]. Fold [9] transformations introduce auxiliary predicates and rules that correspond to replacing a code-block with an auxiliary procedure. They are justified by Rigid. Unfold transformations can be justified by Update and correspond to macro elimination.

Summary
We introduced a calculus of pre-processing for SMT. It distinguishes simplifications that are rigid and so can be applied to new formulas as substitutions. Other simplified formulas may need to be re-introduced, similar to re-introducing removed clauses in SAT. We examine several of the pre-processing methods studied in SAT, ATP, MIP and SMT as instances of the calculus. We leave empirical and algorithmic studies of new pre- and in-processing methods to future work. Another angle we have left on the table is reconciling pre-processing with in-processing. For SAT, it was useful to develop a calculus that accounted for both irredundant and redundant clauses. In our current effort we have set this angle aside in favour of establishing main properties on replaying substitutions.
An important class of simplifications is based on eliminating variables by finding solutions to them. In the formula x ≤ y + 1 ∧ x ≥ y + 1 ∧ φ[x, y] we can solve for x (or y) by setting x ≃ y + 1 and then substituting the solution for x into φ. The simplified formula is φ[y + 1, y]. The models of the original formula must all satisfy the equality x ≃ y + 1. This property allows reusing the simplification when later adding a formula ψ[x, y]. It can be added by applying the solution for x: ψ[y + 1, y]. A model of φ[y + 1, y] ∧ ψ[y + 1, y] must conversely correspond to a model of the original formulas φ[x, y] and ψ[x, y].

Fig. 3. A calculus for reverting pre-processing. Undo reverts a simplification by re-introducing a constraint. It prunes θ until Add applies for a new constraint Φ.
- Flex: From the side condition Ψ ⪰_{x → t} Ψ[t/x], for every model of F there is a model of Ψ[t/x] that agrees with it on the symbols from F. Conversely, F′, Ψ properly contains F and therefore implies it. Therefore, F ⪰ε F′, Ψ.
- Update: We want to show that F, Ψ simplifies to F, Ψ, Φ modulo ε. The premise of Update ensures that for every M |= F, Ψ there is a model agreeing with M on symbols in F, Ψ that satisfies F, Φ. Since the interpretation of the symbols in Ψ is unchanged, it also satisfies Ψ. Conversely, if M′ |= F, Ψ, Φ, then already M′ |= F, Ψ and therefore M′ε |= F, Ψ.
- Rigid: We wish to establish that F ⪰ε F′, Ψ. First observe that F′, Ψ = F, Ψ[t/x]. Since Ψ implies the equation ∃y . x ≃ t, for every model of F there is a solution to y such that Ψ[t/x] holds while agreeing with the variables in F. Conversely, if F, Ψ[t/x] is satisfied by M′, then M′ already satisfies F.
With Corollary 4 and Lemma 7 we have then established Proposition 2.