1 Introduction

Larry Wos wrote the president’s column of the AAR Newsletter from its first installment in March 1983 till #131 in May 2020. The text below was not originally planned for the AAR newsletter, but was written for a colleague of his, to show a student how such a column can be written. Larry sent this final manuscript to Sophie Tourret in 2020 asking for feedback, but passed away before they could even start a discussion about it. We do not know if the text ever made its way to the intended recipient, but, for sure, it would have made it to a later issue of the AAR newsletter, had Larry had time to finish it. Thus, we have decided to publish excerpts from this final manuscript, put them into the historic context, and explain their relevance to automated reasoning research today.

2 The Column

The typical column by Larry contained at least two elements: anecdotes on the (early) history of automated reasoning followed by an open problem. Open problems came mainly from three areas: (finite) group theory, properties of metalogic formulations from the condensed detachment or equivalential calculus, or logic puzzles. We present three excerpts from Larry’s final manuscript: two anecdotes from the early history of automated reasoning—namely on the relevance of instances and the set of support strategy—and the Candy Puzzle.

Finding Relevant Instances: A key technique in (refutation-based) automated reasoning is to identify relevant instances of universally quantified formulas. The unpublished column contains several remarks by Larry emphasizing the role of finding such instances.

By definition, automated reasoning refers to a field in which a computer program is written that applies logical reasoning. A person can reason in a manner that is unavailable to an automated reasoning program, a way that is called instantiation. Instantiation is a type of reasoning in which variables can be replaced with other variables or with terms.

Instantiation of first-order logic formulas was actually the basis of early attempts to automated reasoning [12, 24] able to already automatically prove simple valid first-order formulas. While Gilmore [24] resolved quantifier alternations by so called “multiplications” of ground instances of the formula, Davis and Putnam [12] already applied Skolemization to existentially quantified variables and instantiated with ground Herbrand terms. Noneveless, citing Prawitz “the main weakness of the first programs for theorem proving was the manner of generating substitution instances” [34]. Resolution was seen as—and actually still is—a major breakthrough, because it does not require guessing ground instances. However, if the set of ground instances of clauses needed for a proof is small, then explicitely generating these ground instances is to be preferred over resolving among the clauses. This fact is reflected by state-of-the-art research where, e.g., InstGen [22] is currently the best single reasoning approach to first-order problems where a rather small set of ground instances from the input clause set suffices. InstGen is based on the instantiation of clauses guided by an abstraction of the first-order clauses to propositional clauses. However, building resolvents on input clauses can lead to exponentially shorter proofs compared to solely reasoning on instances, even for decidable fragments and small clause sets with only a few constants as the only function symbols [20, 33]. For an overview of recent reasoning methods incorporating instantiation, see [6].

For example, if you are studying groups in which the square of every element is the identity e,and if you are employing the function fto mean product, then, with instantiation, you can obtain \(f(f(u,v),f(u,v)) = e\)from \(f(x,x) = e\).Since instantiation is unavailable for an automated reasoning program, the obtaining of the preceding equation, \(f(f(u,v),f(u,v)) = e\),will not occur.

The statement “instantiation is unavailable for an automated reasoning program” by Larry means that finding the “relevant” instances towards a proof is often as hard as finding the proof itself. Since validity is not decidable in equational logic, finding the right instances cannot be decidable as well, in general, hence instantiation by relevant instances is unavailable for an automated reasoning program. Technically, almost all automated reasoning calculi enable the addition of instances. The work of [22], see above, was actually an important step towards finding relevant instances through an abstraction into a decidable fragment, in this case propositional logic. For an overview on possibilities towards finding relevant instances, consider again [6]. Reasoning by SMT [32] (Satisfiability Modulo Theories) is built on these ideas, see below.

Nevertheless, group theory, and equational reasoning in general is a nice example for the power of instantiation. Ground equations can always be oriented, e.g., by a Knuth–Bendix ordering [26], and Knuth–Bendix completion terminates on a set of ground equalities and disequalities. Oriented equations are rewrite systems that are fundamental for many concepts in computer science such as the semantics of programming languages and have therefore obtained a lot of attention [11, 13, 27]. An alternative approach, typically implemented in SMT  [3, 29, 32] solvers, is to decide the satisfiability of a set of ground equations by congruence closure [31]. Therefore, if a finite set of needed (ground) instances for a refutation is known in advance, then the problem of whether a conjectured equation is a consequence of other equations turns from being undecidable, in general, into a decidable problem.

1979 was the year that Veroff began to express an interest in automated reasoning, and he has remained interested ever since. In 1996, Veroff introduced his hints strategy for directing a program’s reasoning in a manner that proved to be more effective than is weighting. He introduced his “sketches” in 2001, an approach that enabled him to answer various open questions.

Veroff introduced hints [42] to direct the proof search, integrated in Otter [28], the role model of all modern theorem provers, designed by Bill McCune. Otter has a sophisticated weight system for clauses. In its simplest form, the weight of the clause is the sum of the weights of its symbols, and clauses with a small weight are preferred for performing inferences during proof search. A clause weighting mechanism is an inherent ingredient of all of today’s implementations of saturation-based theorem provers [5, 18, 36, 38, 41, 45]. The hint technique introduced by Veroff adds to the clause weighting mechanism: clauses subsumed by the hint clause or clauses subsuming the hint clause receive special treatment with respect to their weight. For example, if certain clauses sharing some abstract structure are thought to be essential in finding a proof, then a clause representing this abstract structure can be used as a hint clause. Then any clause that is subsumed, i.e., a superset of an instance of the hint clause, can be preferred for performing further inferences. In an extreme setting, hints can simulate the restriction of inferences to certain ground instances. Veroff successfully used the clause hint technique to find and analyze proofs [43] and combined it with another technique called proof sketches. A proof sketch is an incomplete, possibly incorrect proof that is then used to guide the search for a correct proof. A proof sketch may be obtained by an abstraction of the actual problem, for example by replacing non-variable terms with fresh variables.

This idea of restricting inferences and instantiation to certain patterns plays also an important role in SMT solving. Based on ideas from the Nelson-Oppen combination procedure [7, 9, 10, 16, 30], modern SMT solvers are highly efficient decision procedures for the ground combination of several theories, e.g., ground first-order logic with equality combined with ground arithmetic—on this topic, see also The Candy puzzle, page 5. If this approach needs to consider problems that also contain universally quantified formulas, it does so by consecutive additions of ground instances of these formulas. The research on this topic started with the Simplify system [14], where so-called triggers are used to generate ground instances out of universally quantified formulas. A trigger is a term and only instances of this term are used for proof search, similar to the concept of hints. Simplify makes additional use of the decidability of equational reasoning on the ground level. It builds an explicit model of the generated congruence and then prefers instances of the trigger that are related to the congruence. This idea was generalized to consider ground models not only for an equational theory, but also for other theories for restricting the generation of ground instances [15, 23]. More recently, it has turned out that, as a last resort and if done with care, even an enumeration of possible ground instances can result in an efficient system [35]. In summary, the ideas suggested by Larry and his collaborators have influenced future research and are still further developed nowadays, in particular in the areas of first-order theorem proving and SMT solving.

The Set Of Support Strategy: Another anecdote by Larry is on the origin of the set of support strategy [47].

In a summer visit, Willam F. Miller, Director of the Applied Mathematics Division at Argonne National Laboratory, invited John Alan Robinson. Miller introduced Robinson to Larry Wos and to Dan Carson, an introduction that had unbelievable consequences for the field that would eventually be called automated reasoning. While Robinson was visiting Argonne, he introduced his new inference rule that he called binary resolution. The introduction of the inference rule binary resolution changed the course of history for automated reasoning forever.


When Carson, who was a brilliant programmer in IBM assembly, learned of the new inference rule, he wrote a mechanical theorem-proving program encoding the rule. Carson and Wos used his program in an attempt to prove a trivial theorem in group theory. Carson’s program was unable to find the sought-after proof. The theorem asserts that, if the square of every element xin the group is the identity e , then the group is a commutative group, \({f(x,y)=f(y,x)}\)for every element in the group. Carson called me by phone and said that, if I had nothing to suggest, we were finished, and he gave me forty-five minutes to come up with something. After thirty-two minutes had elapsed, he called again, and I told him about the set of support strategy. Carson eagerly asked for permission to extend his program so that the program could apply the new strategy. The rest is history: with the extended program, the sought-after proof was found and found in less than 3 CPU-seconds. If Carson had not been so impatient, the use of some type of strategy would never have occurred. He is a hero for, without strategy, proofs would almost never be found with a program that reasons logically.


Learning about the set of support strategy, Robinson set about to prove that the strategy is refutation complete.

The encoding of the group theory used at that time by Wos and his collaborators was the following [46, 47]: the group operation is modeled by a ternary predicate P, where P(xyz) stands for the group multiplication of x and y resulting in z. Using this predicate, basic properties of a group can be axiomatized. For example, associativity was represented by clauses like \(\lnot P(x,y,u) \vee \lnot P(y,z,v) \vee \lnot P(u,z,w) \vee P(x,v,w)\) by introducing explicit result variables for the sub-products. However, a group theory axiomatization solely built on this predicate P is not complete, in general, because P is a relation and does not represent that the group operation is a function nor that it is total. For cases where this is needed, e.g., proving that a group where the square of each element is the identity is commutative, an additional binary predicate R for equality was added together with the equivalence relation axioms for R and the congruence axioms for P and used functions. For example, for the function f expressing totality by the unit P(xyf(xy)) the congruence axiom \(\lnot R(x,x') \vee \lnot R(y, y') \vee R(f(x,y),f(x',y'))\) was added, for the group multiplication the congruence axiom \(\lnot R(x,x') \vee \lnot R(y, y') \vee \lnot R(z,z') \vee \lnot P(x,y,z) \vee P(x',y',z')\). The motivation for this encoding was to not expose the group multiplication to full equational reasoning which turned out to be far more successfull, because Knuth–Bendix completion and equational reasoning [26, 37] were not yet developed at this time. The problem was attacked by resolution through a partial axiomatization of equality, depending on what was needed for the actual problem at hand. A number of different axiomatizations following this approach can be found in [46]. Using such a formulation without specific equational reasoning, such problems are still a challenge to today’s automated reasoning systems. Examples of such encodings are contained in the TPTP [40], e.g., the above-mentioned problem is called GRP001-1 in the TPTP.

The set of support strategy consists of dividing an unsatisfiable clause set into two disjoint sets N and S. Then, any resolution inference is restricted to use at least one clause from S, the set of support. Newly derived clauses are added to S. Wos together with his collaborators showed [47] that this restriction of the resolution calculus is complete if N is satisfiable. In a setting where a conjecture is to be shown from a set of satisfiable axioms, such a setup can easily be obtained by putting the axiom formulas into N and the (negated) conjecture into S. Recently, it has been shown that for completeness of the set of support strategy, it is necessary and sufficient that there exists a resolution refutation with at least one clause from S [25]. The set of support strategy has become an integral part of many approaches to automated reasoning. It is implemented in all first-order logic resolution-based theorem provers [5, 18, 36, 38, 41, 45]. In case of additional ordering restrictions, it is still complete if the set N is closed under non-redundant inferences. In the purely equational case, this boils down to the generation of all critical pairs modulo rewriting and elimination of trivial equations, and in the case of first-order logic with equality to the generation of all superposition inferences modulo redundancy [1].

The Candy Puzzle: Puzzles have a long history in motivating automated reasoning research by showing deficiencies of state-of-the-art reasoning calculi or systems. An early example is Schubert’s Steamroller problem [39], a puzzle that was not easy to solve without a concept of typing and, therefore, motivated research in this direction. Larry’s manuscript contained the following puzzle.

                                       Candy Puzzle

Jane’s Confections is an old-fashioned sweet shop next to the old post office in Chicago, and sweet-toothed Illinois residents traveled many a mile to buy chocolate and caramels, as they did when they were kids.

This afternoon, four locals, all on their way to collect their children from a nearby junior high school, have popped in to buy a bag of something scrumptious. From the clues, can you say at what time each customer called, and what weight of what sugary treat Jane sold them?


Tobias nipped in later than Sarah and bought one and a quarter pound more of his chosen sweet than the purchaser of toffee bought of toffee.


The customer who bought 3 pounds of creams was not the customer who came in as the clock struck 3.


The second customer of these four bought one and a half pounds more humbugs than Ursula bought of her favorite confection.


Virgil bought one and a half pounds more of his selected sweet than did the person who walked into the shop at 3:10 but who did not buy lemondrops.


The customers called at 3, 3:05, 3:10, and 3:15.


They bought one and three quarter pounds, three pounds, three pounds and a quarter pound, and four and a half pounds.

The first exercise is to solve the puzzle with the help of an automated reasoning system:


  1. (1)

    Formulate the puzzle in logic and let an automated reasoning system find the solution.

Implicitly, the assumption is that all mentioned constants occur exactly once in the solution. For example, every customer does exactly one purchase, each different sweet is bought exactly once, etc. Then the puzzle boils down to a finite domain problem with \(4^4\) different possibilities of a single purchase. It can then be formalized in pure first-order logic—without theories—or even translated into propositional logic and solved that way. Solving the puzzle amounts to finding a model for the formalization, a task that is more difficult than finding a refutation (proof). We did a formalization in first-order logic and both SPASS [45] and Vampire [36] find a solution in less than a second thanks to splitting [44]. Also the grounding of the first-order formulization can be immediately solved by any SAT solver starting at the performance of MiniSat [19]. However, these formalizations require the explicit axiomatization of the needed arithmetic and linear-order concepts and properties involved in the puzzle. A more natural formulization is possible in a first-order logic over linear arithmetic. We could use a four-place predicate P where P(uxyz) means that u bought the amount x of y at time z. For the timing we use the values 300, 305, 310, and 315 and for the amount the values in quarter pounds, i.e., 7, 12, 13, and 18. Then a complete formalization is as follows. Firstly, we declare the existence of the four purchases

$$\begin{aligned}&P(\text {Tobias}, at, bt, ct) \wedge P(\text {Sarah}, as, bs, cs) \quad \wedge \\&\qquad P(\text {Ursula}, au, bu, cu) \wedge P(\text {Virgil}, av, bv, cv)\\ \end{aligned}$$

where all symbols starting with a, b, or c are constants (existentially quantified variables). Then, we get from the first clue

$$\begin{aligned} \begin{array}{ll} 1.\hbox {a} &{} P(\text {Tobias}, x, y, z) \wedge P(\text {Sarah}, x', y', z') \rightarrow z > z'\\ 1.\hbox {b} &{} P(\text {Tobias}, x, y, z) \wedge P(u, x', \text {toffee}, z') \rightarrow x = x' + 5\\ \end{array} \end{aligned}$$

where all variables xyzu (with primes) are universally quantified and \(\wedge \) binds stronger than \(\rightarrow \). The second clue becomes

$$\begin{aligned} \begin{array}{ll} 2 \quad &{} P(c_{21}, 12, \text {creams}, c_{22}) \wedge c_{22} \not = 300\\ \end{array} \end{aligned}$$

where the constants \(c_{21}\) and \(c_{22}\) are finally mapped by the finite domain axioms to the available timings, and persons. The clues three and four become

$$\begin{aligned} \begin{array}{ll} 3 &{} P(c_{31}, au + 6, \text {humbugs}, 305) \wedge c_{31} \not = \text {Ursula}\\ 4 &{} P(c_{41}, av - 6, c_{42}, 310) \wedge c_{41} \not = \text {Virgil} \wedge c_{42} \not = \text {lemondrops} \end{array} \end{aligned}$$

and the remaining clues are contained in the finite domain axioms.

$$\begin{aligned} \begin{array}{ll} 5 &{} P(u, x, y, z) \rightarrow (z = 300 \vee z = 305 \vee z = 310 \vee z = 315)\\ 6 &{} P(u, x, y, z) \rightarrow (x = 7 \vee x = 12 \vee x = 13 \vee x = 18)\\ 7 &{} P(u, x, y, z) \rightarrow (y = \text {toffee} \vee y = \text {creams} \vee y = \text {humbugs} \vee y = \text {lemondrops}) \\ 8 &{} P(u, x, y, z) \rightarrow (u = \text {Tobias} \vee u = \text {Sarah} \vee u = \text {Ursula} \vee u = \text {Virgil}) \\ \end{array} \end{aligned}$$

Solving the puzzle formulated this way is a challenge. To the best of our knowledge, there is no automated reasoning tool that can, without further massage, derive a solution from this formalization. We tried the SMT solvers CVC4 [4] and Z3 [17] without success. Also, approaches based on resolution-style reasoning over constraint clauses will not succeed as long as the finite domain axioms do not get special treatment [2, 8]. One reason is the combination of constants and universally quantified variables over numbers which, together with first-order predicates, leads to a logic that is no longer compact, in general. Recall that compactness here means that for every infinite unsatisfiable set of clauses there exists always finite unsatisfiable subset. For example, the combination of first-order predicates and linear rational arithmetic already enables the definition of the natural numbers and the introduction of a single constant can then cause non-compactness [21]. The formulas

$$\begin{aligned} \begin{array}{l} \text {Nat}(0)\\ \text {Nat}(x) \rightarrow \text {Nat}(x+1) \\ x< 0 \rightarrow \lnot \text {Nat}(x)\\ 0< x < 1 \rightarrow \lnot \text {Nat}(x)\\ x > 0 \wedge \text {Nat}(x + 1) \rightarrow \text {Nat}(x) \end{array} \end{aligned}$$

define the natural numbers with respect to a universally quantified variable x ranging over the rationals using the predicate \(\text {Nat}\). Non-compactness arises when this definition is combined with the four formulas

$$\begin{aligned} \begin{array}{l} \text {Nat}(a)\\ P(0)\\ P(x) \rightarrow P(x+1)\\ \lnot P(a)\\ \end{array} \end{aligned}$$

for some constant a. The combination of all the formulas is unsatisfiable; however, every finite grounding of the formulas has a model. Still the above formulation of the puzzle is solvable, and even decidable because of the finite domain axioms. Further challenges are the following:


  1. (2)

    Is the solution from the clues unique?

  2. (3)

    If clue six is removed, is it possible to derive automatically the amounts for the purchases?

Conclusion: The scientific contributions by Larry Wos have shaped automated reasoning. Even in his later years, his columns continued to offer relevant insight into the past of automated reasoning and presented challenges that are still relevant to automated reasoners nowadays. Larry Wos was a founder of automated reasoning. He died on 20 August 2020.