
1 Introduction

Symbolic computations arise in many mathematical proofs as well as in science and engineering. The use of computers to ensure their correctness is hence an important problem. Interactive theorem provers and computer algebra systems provide two alternative approaches. Most interactive theorem provers have extensive libraries in analysis [6], based upon which one can verify correctness of computations with a very high level of confidence. However, the learning curve for using such libraries is quite steep. On the other hand, computer algebra systems, such as Mathematica and Maple, aim to perform computations automatically. However, it is difficult to guide the computation if the automatic procedure fails, and correctness is not fully guaranteed. Indeed, there have been examples of mistakes made by such computer algebra systems in the past [11].

Previous work [18] introduces a system for performing and verifying symbolic computation as an extension to the HolPy interactive theorem prover [19]. The user can perform calculation of definite integrals step-by-step, using rules such as substitution, integration by parts, etc. Each step has a relatively simple implementation, and proofs in higher-order logic can be constructed automatically from the sequence of steps, which in turn can be checked by the HolPy kernel. This provides a user experience which can be seen as a mix between the two approaches discussed above, combining the more intuitive feel of computer algebra systems with a higher level of confidence in the results.

In this paper, we present a significant extension to the work in [18], forming an independent tool named Iscalc (Interactive symbolic calculations). In particular, we make the following extensions aimed at greater safety, extensibility, and ability to handle a wider range of examples.

  1. We introduce user-level definitions and dependency among computations, allowing construction and reuse of custom theories. This is achieved by maintaining contexts, which contain the list of existing definitions and identities, as well as assumptions in the current computation.

  2. We introduce systematic checks on wellformedness of expressions and side-conditions for applying certain rules within Iscalc (rather than only when reconstructing proofs). This increases confidence in the computation without proof reconstruction.

  3. In addition to definite integrals, the tool now supports computation with limits, series, and indefinite integrals. We also support improper integrals, and many more techniques of computation, such as series expansions and differentiating under the integral sign.

  4. With only a few exceptions (such as partial fraction decomposition), all functionalities are now implemented independently rather than depending on SymPy. We found this approach, aimed at avoiding problems caused by limitations of SymPy, to be more flexible and extensible in the end.

One of our main aims and yardstick for measuring progress is verifying computations from the textbook Inside Interesting Integrals [17]. This book contains many computations of integrals using a variety of techniques, including differentiating under the integral sign, series expansions, and so on. Many computations are quite involved (the longest example we did, Ahmed’s Integral, is 4 pages long in the book). We also carry over and complete some of the case studies in [18].

Our aim is to provide a user interface that is more intuitive and accessible to mathematicians and engineers. In particular, computations are displayed in LaTeX form, and whenever there is tension between conventional mathematical language and the more precise formal language, we prefer the former. We take a best-effort approach to correctness, providing systematic checks for the usual mistakes, such as cancelling expressions that may be zero, or exchanging sums that are not absolutely convergent. However, full correctness guarantees in the sense of interactive theorem proving are not achieved without proof reconstruction, which we leave to future work. In this respect, our approach is more similar to SMT solvers and program verification tools based on them, which sacrifice some correctness guarantees for more efficiency and speed of development.

We now give an outline for the rest of this paper. Section 2 describes the overall architecture of Iscalc. Section 3 shows results of case studies, and gives some interesting examples. Section 4 discusses some lessons we took from this work, especially for user interface design. Section 4.1 discusses related work and Sect. 5 concludes the paper.

2 Architecture

Iscalc has a layered architecture consisting of several modules, as shown in Fig. 1. In this section, we begin with some preliminary definitions, then describe the functionality of each module in turn.

Fig. 1. Overall architecture

2.1 Preliminaries

The term language of Iscalc inherits from that in [18], but with extensions for limits, summation, and indefinite integrals. The full syntax is as follows.

$$\begin{aligned} e:= & {} v \,\vert \, c \,\vert \, e_1\ \textit{op}\ e_2 \,\vert \, f(e) \,\vert \, \textsf {Deriv}(e,v) \,\vert \, \textsf {Integral}(e,v,a,b) \,\vert \, \\{} & {} \textsf {Limit}(e,v,a,\textit{dir}) \,\vert \, \textsf {Sum}(e,i,a,b) \,\vert \, \textsf {IndefiniteIntegral}(e,v,\textit{deps}) \,\vert \, \textsf {Skolem}(n,\textit{deps}) \end{aligned}$$

Constructors on the first line stand for variables, constants, operators, function applications, derivatives, and definite integrals, respectively. Constants are extended to include positive and negative infinities. Constructors on the second line are new, and we explain them in more detail.

\(\textsf {Limit}(e,v,a,\textit{dir})\) represents the limit of expression e as variable v goes to expression a, where \(\textit{dir}\) represents the direction of the limit. That is, we distinguish between \(\lim _{x\rightarrow 0+} f(x)\) and \(\lim _{x\rightarrow 0-} f(x)\), etc. \(\textsf {Sum}(e,i,a,b)\) represents summation of expression e as the integer index i goes from a to b (inclusive, except when \(b=\infty \)). \(\textsf {IndefiniteIntegral}(e,v,\textit{deps})\) and \(\textsf {Skolem}(n,\textit{deps})\) are used together for computing with indefinite integrals. The former represents the indefinite integral of e with respect to v. When this is evaluated to an expression plus “C”, this C is represented by a Skolem term. Here \(\textit{deps}\) represents the additional variables that C may depend on, which come from the list of dependent variables \(\textit{deps}\) of the indefinite integral. The use of dependent variables in evaluating indefinite integrals is illustrated by an example in Sect. 3.1.
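To make the grammar concrete, the following is a minimal sketch of how such a term language could be encoded as immutable Python data classes. All class and field names here are hypothetical and chosen to mirror the grammar; they are not taken from the Iscalc sources.

```python
from dataclasses import dataclass
from typing import Tuple, Union

# Hypothetical encoding of the term grammar; names are illustrative only.

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Const:
    value: Union[int, float]  # float("inf") / float("-inf") model the infinities

@dataclass(frozen=True)
class Op:
    op: str
    left: "Expr"
    right: "Expr"

@dataclass(frozen=True)
class Fun:
    name: str
    arg: "Expr"

@dataclass(frozen=True)
class Deriv:
    body: "Expr"
    var: str

@dataclass(frozen=True)
class Integral:
    body: "Expr"
    var: str
    lower: "Expr"
    upper: "Expr"

@dataclass(frozen=True)
class Limit:
    body: "Expr"
    var: str
    point: "Expr"
    direction: str  # "+", "-", or "" for a two-sided limit

@dataclass(frozen=True)
class Sum:
    body: "Expr"
    index: str
    lower: "Expr"
    upper: "Expr"

@dataclass(frozen=True)
class IndefiniteIntegral:
    body: "Expr"
    var: str
    deps: Tuple[str, ...]  # variables the integration constant may depend on

@dataclass(frozen=True)
class Skolem:
    name: str
    deps: Tuple[str, ...]

Expr = Union[Var, Const, Op, Fun, Deriv, Integral, Limit, Sum,
             IndefiniteIntegral, Skolem]

# lim_{x -> 0+} sin(x) / x
lim = Limit(Op("/", Fun("sin", Var("x")), Var("x")), "x", Const(0), "+")

# Indefinite integral of pi/(2a) with respect to a, with b as a dependent variable.
indef = IndefiniteIntegral(
    Op("/", Var("pi"), Op("*", Const(2), Var("a"))), "a", ("b",))

# The constant "C" produced on evaluation inherits the dependent variables: C(b).
c = Skolem("C", indef.deps)
```

In this encoding the Skolem term simply copies the \(\textit{deps}\) of the indefinite integral it came from, which is the bookkeeping described in the paragraph above.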

Another extension compared to [18] is the addition of formulas. These are used to specify goals, wellformedness conditions on terms, as well as assumptions on goals and definitions. Currently we support the following constructors for formulas:

$$\begin{aligned} f:= & {} e_1\ \textit{op}\ e_2 \,\vert \, \textsf {isInt}(e) \,\vert \, \textsf {notInt}(e) \,\vert \, \textsf {converges}(e) \end{aligned}$$

where the binary operator op is one of \(=,\ne ,<,\le ,>,\ge \). \(\textsf {isInt}(e)\) and \(\textsf {notInt}(e)\) represent e is/is not an integer. \(\textsf {converges}(e)\) represents e is convergent, where e is a series whose upper limit is \(\infty \).

2.2 Context

In [18], each computation is independent from each other, and all available definitions and identities are built into the kernel. In contrast, Iscalc develops a system of user-level definitions and dependency between computations similar to usual interactive theorem provers. This is achieved by a hierarchy of books, files, definitions and goals. Each book consists of an ordered list of axioms, definitions, and files, and may depend on other books. Each file contains a list of goals, whose computation may depend on previous items in the book. Each definition specifies a new function along with assumptions on the arguments of that function. Each axiom or goal specifies a single expression to be proved under a set of premises. It may be marked with attributes to specify its type or how it is to be used (e.g. whether it can be used during simplification).

In the implementation, a Context object maintains the list of definitions, identities, and inequality rules available at the current file. It also contains the premises and inductive hypothesis for the current computation (these are modified when performing a case analysis or induction, as described in Sect. 2.5).
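As a rough illustration of this bookkeeping, the sketch below models a context as a chain of objects, where each case-analysis branch adds its own premises on top of those of the enclosing computation. The class layout, the method names, and the use of plain strings for formulas are assumptions made for this example. The text says only that premises are modified during case analysis or induction, so the parent-chain design is one possible reading and not a description of Iscalc's actual Context class.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Context:
    """Hypothetical context: definitions, identities, and current premises."""
    parent: Optional["Context"] = None
    definitions: List[str] = field(default_factory=list)
    identities: List[str] = field(default_factory=list)
    premises: List[str] = field(default_factory=list)

    def all_premises(self) -> List[str]:
        # Premises of enclosing contexts come first, then the local ones.
        inherited = self.parent.all_premises() if self.parent else []
        return inherited + self.premises

    def extend(self, *premises: str) -> "Context":
        # A child context for one branch of a case analysis or induction.
        return Context(parent=self, premises=list(premises))

# Goal-level premise, then two branches of a case analysis on x.
goal_ctx = Context(premises=["cos(a) != 0"])
case_zero = goal_ctx.extend("x = 0")
case_nonzero = goal_ctx.extend("x != 0")
```

Keeping branches as separate child objects means that the premise added in one case never leaks into the other.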

2.3 Algorithms

Iscalc implements several basic algorithms in computer algebra, for checking inequalities, simplification and normalization of expressions, computing limits, and solving equations. All of these take a Context object as input, and depend on the context information.

Inequality Checking. Unlike in the previous paper, condition checking is implemented entirely from scratch rather than relying on SymPy. It is well-known that checking inequalities involving transcendental functions is undecidable. Our goal is to perform simple rule-based reasoning automatically, leaving more involved inequalities to be proved with user guidance. The overall approach is saturation: we maintain a dictionary mapping expressions to conditions on them. Given an expression for which we wish to derive some conditions, saturation works recursively on each subexpression, matching it against the main argument of each rule (left side of inequalities, or the last argument of predicates). For each match, it looks in the dictionary for existing facts that justify the assumptions of the rule. Special reasoning is performed on numerical constants (e.g. \(x < c_1\) can be used to justify \(x < c_2\) if \(c_1\le c_2\)). Comparisons between numerical constants are currently done with floating-point approximation.

The approach described here is relatively simple, and it is not difficult to ensure termination, as we only get conditions on expressions that already appear. However, in practice it can be quite powerful when combined with user-guided rewriting, as shown by the example in Sect. 3.2.
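The following toy sketch illustrates the saturation idea on sign conditions only. Expressions are nested tuples, the dictionary `facts` maps each expression to a set of condition tags, and a handful of hard-coded rules stand in for the rule set. The representation, the tags `">0"`, `">=0"`, `"!=0"`, and the particular rules are assumptions made for this illustration and are not Iscalc's implementation.

```python
def saturate(expr, facts):
    """Derive sign conditions for expr bottom-up; returns the set for expr."""
    known = facts.setdefault(expr, set())
    if isinstance(expr, (int, float)):
        # Special reasoning on numerical constants.
        if expr > 0:
            known.add(">0")
        elif expr == 0:
            known.add(">=0")
    elif isinstance(expr, tuple):
        op, *args = expr
        for arg in args:  # only subexpressions that already appear are visited
            saturate(arg, facts)
        if op == "^" and isinstance(args[1], int) and args[1] % 2 == 0:
            known.add(">=0")  # even powers are nonnegative
            if "!=0" in facts[args[0]]:
                known.add(">0")  # even power of a nonzero base is positive
        elif op == "*":
            for cond in (">0", ">=0", "!=0"):
                if all(cond in facts[a] for a in args):
                    known.add(cond)
        elif op == "+":
            if all(">=0" in facts[a] for a in args):
                known.add(">=0")
                if any(">0" in facts[a] for a in args):
                    known.add(">0")
    if ">0" in known:  # positive implies nonnegative and nonzero
        known.update({">=0", "!=0"})
    return known

# 4 x^2 cos(a)^2 + (x^2 - 1)^2, with premises x != 0 and cos(a) != 0.
x2 = ("^", "x", 2)
cos2 = ("^", ("cos", "a"), 2)
target = ("+", ("*", ("*", 4, x2), cos2), ("^", ("-", x2, 1), 2))

with_premises = saturate(target, {"x": {"!=0"}, ("cos", "a"): {"!=0"}})
without_x = saturate(target, {("cos", "a"): {"!=0"}})
```

Termination is immediate because the recursion only descends into subexpressions of the input. With both premises the sketch derives that the target is positive; if the premise on x is dropped, only nonnegativity is derived.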

Simplification. Simplification of expressions works in mostly the same way as [18], and we restate the main ideas. We normalize with respect to AC-property of addition and multiplication, and combine equal terms. When trying to combine \(t^at^b\) into \(t^{a+b}\), we check using the current context that either t is nonzero and a, b are integers, or t is nonnegative. This prevents cancellation of e.g. t/t into 1 when t may be zero.

Moreover, we apply identities in the context that are marked with the simplify attribute. These cover evaluation of functions at special values, as well as issues like removal of absolute value sign (e.g. \(|x|=x\) if \(x\ge 0\)).

Normalization. There are situations where different forms of an expression are desirable for different purposes, e.g. factorized vs. expanded form of a polynomial, single quotient vs. a sum of quotients, etc. We designed the simplifier to not make a choice in such situations. Instead, if the user wishes to convert an expression to a different form, she can specify the rewriting explicitly. Iscalc then normalizes both old and new expressions and checks whether they are equal. Normalization expands polynomials and combines quotients (e.g. for checking partial fraction decomposition), and performs (among others) rewriting of logarithms and exponentials.
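The check "normalize both sides and compare" can be sketched for the polynomial case as follows. Expressions are nested tuples over integers and atom names, and the normal form is a dictionary from monomials to integer coefficients. This is a simplified illustration with hypothetical function names; it covers only polynomial expansion, not quotients, logarithms, or exponentials.

```python
from collections import defaultdict

def add(p, q):
    """Add two polynomials given as {monomial: coefficient} dictionaries."""
    result = defaultdict(int)
    for poly in (p, q):
        for mono, coeff in poly.items():
            result[mono] += coeff
    return {m: c for m, c in result.items() if c != 0}

def multiply(p, q):
    """Multiply two polynomials, merging exponents of equal atoms."""
    result = defaultdict(int)
    for m1, c1 in p.items():
        for m2, c2 in q.items():
            powers = defaultdict(int)
            for atom, exp in m1 + m2:
                powers[atom] += exp
            result[tuple(sorted(powers.items()))] += c1 * c2
    return {m: c for m, c in result.items() if c != 0}

def normalize(expr):
    """Expand expr into its {monomial: coefficient} normal form."""
    if isinstance(expr, int):
        return {(): expr} if expr != 0 else {}
    if isinstance(expr, str):  # an atom such as "x" or "c"
        return {((expr, 1),): 1}
    op, left, right = expr
    if op == "^":  # right is a non-negative integer exponent
        base, result = normalize(left), {(): 1}
        for _ in range(right):
            result = multiply(result, base)
        return result
    p, q = normalize(left), normalize(right)
    if op == "+":
        return add(p, q)
    if op == "-":
        return add(p, {m: -c for m, c in q.items()})
    if op == "*":
        return multiply(p, q)
    raise ValueError(f"unsupported operator: {op}")

def same_normal_form(old, new):
    return normalize(old) == normalize(new)

# With c standing for cos(2a):  x^4 + 2 x^2 c + 1  vs  (x^2 - 1)^2 + 2 x^2 (1 + c)
x2 = ("^", "x", 2)
old = ("+", ("+", ("^", "x", 4), ("*", ("*", 2, x2), "c")), 1)
new = ("+", ("^", ("-", x2, 1), 2), ("*", ("*", 2, x2), ("+", 1, "c")))
wrong = ("^", ("+", x2, 1), 2)  # (x^2 + 1)^2, not equal to old in general
```

The user-supplied form `new` is accepted because both sides expand to the same dictionary, while `wrong` is rejected. Neither form is preferred by the simplifier itself, which matches the design choice described above.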

Computing Limits. For limit computations, we implement a simplified version of the approach by Gruntz [10]. To compute \(\lim _{x\rightarrow \infty } e\), we evaluate recursively the limit of each subexpression in e, as well as the asymptotics of approaching that limit. Possible asymptotics include powers of polynomials and logarithms, as well as exponentials. Finding the limit as x approaches other values is converted to computing the limit at infinity.

As with other algorithms, the aim is not to achieve a high level of automation, but to perform the simpler limits, leaving more complex cases to human guidance (e.g. using L’Hôpital’s rule or with rewriting). On the other hand, using the complete algorithm of Gruntz, or the algorithm implemented by Eberl in Isabelle [8], would certainly increase automation and range of applications.

Solving Equations. We implement simple equation solving, including isolating the expression to be solved, and solving linear equations. This is used when performing substitutions and in transforming/applying an existing equality.

2.4 Rules

Based upon the collection of algorithms in the previous section, Iscalc implements a set of rules for transforming the current expression in a computation. Currently 37 rules are available. We give some representative examples below.

Integration Rules. The list of integration rules is mostly inherited from [18]. It includes Substitution, IntegrationByParts, etc. Integration identities can be applied by lookup from the context. There are also rules for more advanced techniques such as differentiating under the integral sign (illustrated in Sect. 3.1), and exchange of integral and sum (illustrated in Sect. 3.3).

Rewriting Rules. The most basic rewriting rule is FullSimplify, which applies simplification to the current expression. ApplyIdentity applies an identity from the context. This generalizes the use of Fu’s rules for trigonometric identities [9]. The rule Equation supports rewriting to another form of an expression with equal normal form. Series expansion and evaluation of series are available as two different rules (again looking up identities from the context).

Equality Transformation Rules. These rules transform one equality into another. IntegralEquation transforms an equation of the form \(\textsf {Deriv}(e,x) = g(x)\) into \(e = \textsf {IndefiniteIntegral}(g,x,\textit{fvars})\), where \(\textit{fvars}\) is the list of free variables in \(\textsf {Deriv}(e,x)\). Another very flexible rule is SolveEquation, which solves for some expression e in an equality \(s=t\) to give another equality \(e=e'\). Other examples include taking limit on both sides, applying a function to both sides, and so on.

Other Rules. Besides the above three major categories, other rules include L’Hôpital’s rule for computing limits, and rules for series manipulations.

2.5 Proof Methods

In [18], the only way to perform a computation is starting from a single expression, and applying rules to transform that expression. More complex applications necessitate more structure in the computation. We describe the structures supported by Iscalc briefly, as they are all familiar from other theorem provers.

Proof by Computation. To show an equality \(a=b\), perform computation on both sides until they become identical. Likewise, for inequalities, perform computation on both sides until the inequality can be shown automatically.

Proof by Transformation. Starting from a known equality \(a=b\), apply the equality transformation rules in Sect. 2.4 to obtain new equalities, until the desired one is obtained.

Case Analysis. To show a goal, divide into cases either by whether some comparison formula is true, or according to whether some expression is less than, equal to, or greater than 0. We show an example with inequality goals in Sect. 3.2.

Induction. Some integrals involve an integer parameter \(n\ge 0\), and may be proved by induction on n. We support such inductive reasoning in Iscalc. The rule ApplyInductHyp can be used to apply inductive hypothesis at any time in the inductive branch of the proof.

2.6 Top-Level Computation, Automation, and User Interface

Based on the above rules and proof methods, Iscalc supports performing a variety of symbolic computations, including showing inequalities, checking convergence, evaluating limits, and performing indefinite and definite integrals. It is also possible to build higher-level automation on top of the rules. An implementation of Slagle’s method is inherited from [18]. It performs best-first search using algorithmic and heuristic steps for performing an integral. If the search succeeds, it outputs a sequence of rules to apply, which can then be replayed in Iscalc.
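The search-then-replay pattern can be sketched with a generic best-first search that returns the list of rule names leading to a solved state. The function below is a minimal illustration, not the inherited implementation of Slagle’s method. The state space used to exercise it is a deliberately artificial toy (integers, with the rules `halve` and `decrement`), and all names are hypothetical.

```python
import heapq
from itertools import count

def best_first_search(start, is_solved, successors, heuristic, max_steps=1000):
    """Return the list of rule names reaching a solved state, or None."""
    tie = count()  # tie-breaker so the heap never compares states or step lists
    frontier = [(heuristic(start), next(tie), start, [])]
    seen = {start}
    for _ in range(max_steps):
        if not frontier:
            return None
        _, _, state, steps = heapq.heappop(frontier)
        if is_solved(state):
            return steps
        for rule, nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                heapq.heappush(
                    frontier, (heuristic(nxt), next(tie), nxt, steps + [rule]))
    return None

# Toy problem: a state is an integer "size"; the goal is to reach 0.
def toy_successors(n):
    if n > 0 and n % 2 == 0:
        yield "halve", n // 2
    if n > 0:
        yield "decrement", n - 1

plan = best_first_search(
    start=6,
    is_solved=lambda n: n == 0,
    successors=toy_successors,
    heuristic=lambda n: n,  # smaller states are expanded first
)

# Replaying the plan step by step reproduces the path to the solved state.
state = 6
for rule in plan:
    state = dict(toy_successors(state))[rule]
```

Because the output is just a sequence of rule names, it can be replayed and checked one step at a time, which is the property the text relies on.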

The user interface of Iscalc is mostly inherited from [18]. The primary goal is to provide a visual interface that feels similar to that of a computer algebra system, and which allows mostly point-and-click based interactions. In particular, computation steps are performed by selecting rules to apply from the menu. For certain rules, the user may need to select a subexpression of the current expression to apply the rule on, and/or choose from suggestions given by the computer (e.g. when rewriting using identities).

Additional features in the current work, such as book and file hierarchy, and proof methods, are also supported in the user interface. This includes display and navigation of book and file contents. To begin the proof of an equation, the user selects from the menu one of the proof methods in Sect. 2.5. The structured computation is then displayed in a reader-friendly format. An example showing display of file contents and a computation is given in Fig. 2.

Fig. 2. Screenshot of the user interface, showing part of the example given in Sect. 3.1. The menu groups related rules into categories. The Proof category contains general actions such as proof by calculation and induction. The remaining five menu categories contain rewriting rules. The left side of the main window shows division of the computation into several parts, and the right side shows the selected part as a series of computation steps. At the bottom (not shown) is space for users to enter additional information for a computation step.

3 Examples

We applied Iscalc to computations of limits, indefinite integrals, and definite integrals from a variety of sources. Three sources are inherited from [18]: an exam preparation book (Tongji), online problem lists by D. Kouba [13], and the MIT Integration Bee [1]. The range of applicability on these problem sets is greater than in [18]. For example, we can now perform all examples in the exponential and trigonometric categories from D. Kouba’s problem lists, whereas the previous work could perform only 7/12 and 22/27 examples, respectively, due to limitations of SymPy as well as other unsupported features.

The main additional benchmark comes from the textbook Inside Interesting Integrals [17]. 71 integral calculations are performed in Iscalc, covering about half the content of the book, including early results about Gamma and zeta functions. Many of the remaining examples involve complex numbers and contour integration, which are not supported by the current version of the tool.

Next, we illustrate some special functionality of Iscalc using examples. With these examples, we wish to emphasize how the different algorithms and rules described in Sects. 2.3 and 2.4 interact with each other, enabling a computation process that is very close to human writing.

3.1 Working with Indefinite Integrals and C

The goal is to evaluate Frullani’s integral (Sect. 3.3 of [17]).

$$\begin{aligned} I(a,b) = \int _0^\infty \frac{\tan ^{-1}(ax)-\tan ^{-1}(bx)}{x}\, dx \end{aligned}$$

under the condition \(a>0,b>0\). The computation starts by computing \(\frac{d}{da}I(a,b) = \frac{\pi }{2a}\), which follows by exchanging derivative and integral, then using the formula for the definite integral \(\int _0^\infty \frac{1}{u^2+1}\,du\). The key step is integrating both sides of \(\frac{d}{da}I(a,b) = \frac{\pi }{2a}\) using rule IntegralEquation to obtain \(I(a,b) = \int \frac{\pi }{2a}\, da\), which evaluates to

$$\begin{aligned} I(a,b) = \frac{\pi \log a}{2} + C(b) \end{aligned}$$

Here it is important to keep track of the dependency of the constant in \(\int \frac{\pi }{2a}\, da\) on the variable b, which is kept in the argument \(\textit{deps}\) of the expression. This variable is then shown explicitly as an argument to the Skolem term C when the indefinite integral is evaluated.

Next, substitute b by a in the above equation, and from \(I(a,a)=0\) obtain \(C(a) = -\frac{\pi \log a}{2}\). Substituting back in the above equation gives the final answer

$$\begin{aligned} I(a,b) = \frac{\pi \log a}{2} - \frac{\pi \log b}{2}. \end{aligned}$$

The entire computation can be carried out in Iscalc much as described above, consisting of one definition and four goals, and using 17 rule applications.
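For reference, the first step of this computation can be written out as follows. This is a sketch of the standard argument, assuming the exchange of derivative and integral is justified and using the substitution \(u=ax\) with \(a>0\); it is not a transcript of the individual Iscalc rule applications.

$$\begin{aligned} \frac{d}{da}I(a,b)&= \int _0^\infty \frac{\partial }{\partial a}\,\frac{\tan ^{-1}(ax)-\tan ^{-1}(bx)}{x}\, dx = \int _0^\infty \frac{1}{1+a^2x^2}\, dx \\&= \frac{1}{a}\int _0^\infty \frac{1}{u^2+1}\, du = \frac{\pi }{2a} \end{aligned}$$

The term in b vanishes under \(\partial /\partial a\), which is why the constant of integration obtained afterwards may still depend on b.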

3.2 Wellformedness Checks

An example from Sect. 2.3 in [17], illustrating partial fraction decomposition, involves computing the following integral:

$$\begin{aligned} I(a) = \int _0^\infty \frac{1}{x^4+2x^2\cos (2a)+1}\, dx \end{aligned}$$

under the condition \(\cos (a)\ne 0\). One particularly tricky point is that it is not obvious why the denominator is always nonzero. This cannot be shown automatically by Iscalc. However, we can state a separate goal showing this fact by case analysis. One of the steps during the computation involves an integral with the same denominator, but with bounds \((-\infty ,\infty )\), so we perform the check without any assumption on x.

We perform case analysis on whether x is equal to 0. If \(x=0\) then the goal simply reduces to \(1\ne 0\). If \(x\ne 0\), we rewrite the goal as follows (the name of the rule applied is shown at right):

$$\begin{aligned}&\ x^4+2x^2\cos (2a)+1&\\ =&\ (x^2-1)^2 + 2x^2(1+\cos (2a))&\textsf {(Equation)} \\ =&\ (x^2-1)^2 + 2x^2(1+(2\cos ^2(a)-1))&\textsf {(ApplyIdentity)} \\ =&\ 4x^2\cos ^2(a) + (x^2-1)^2&\textsf {(FullSimplify)} \end{aligned}$$

Now, from \(x\ne 0\) and \(\cos (a)\ne 0\) we get \(4x^2\cos ^2(a)>0\). Also \((x^2-1)^2\ge 0\), so the whole expression is greater than zero (and hence nonzero). The inequality checking algorithm in Sect. 2.3 is able to perform this reasoning automatically, hence showing the expression in the integral is well-defined. Interestingly, the answer \(\frac{\pi }{4\cos (a)}\) given in the book is not fully correct. It only holds when \(\cos (a)>0\). If \(\cos (a)<0\) the correct answer is \(-\frac{\pi }{4\cos (a)}\) (we can easily check there is a mistake since the integrand is always positive).
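The sign issue can be sanity-checked numerically. The sketch below is an independent floating-point check and not part of Iscalc. It folds the range \([1,\infty )\) onto \((0,1]\) via \(x\mapsto 1/x\), which turns the integral into \(\int _0^1 \frac{1+x^2}{x^4+2x^2\cos (2a)+1}\,dx\), and then applies composite Simpson’s rule. The function names and the sample values of a are choices made for this example.

```python
import math

def folded_integrand(x, a):
    # f(x) + f(1/x) / x^2, where f(x) = 1 / (x^4 + 2 x^2 cos(2a) + 1)
    return (1 + x * x) / (x**4 + 2 * x * x * math.cos(2 * a) + 1)

def simpson(f, lo, hi, n=1000):
    """Composite Simpson's rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(lo + k * h)
    return total * h / 3

def numeric_integral(a):
    return simpson(lambda x: folded_integrand(x, a), 0.0, 1.0)

a_pos = 0.5  # cos(a) > 0
a_neg = 2.0  # cos(a) < 0

value_pos = numeric_integral(a_pos)
value_neg = numeric_integral(a_neg)

# Compare against pi / (4 cos a) and -pi / (4 cos a), respectively.
err_pos = abs(value_pos - math.pi / (4 * math.cos(a_pos)))
err_neg = abs(value_neg + math.pi / (4 * math.cos(a_neg)))
```

For the sample with \(\cos (a)<0\) the numerical value is positive and agrees with \(-\frac{\pi }{4\cos (a)}\), consistent with the correction described above.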

3.3 Convergence Checks

For the final example, we illustrate integration using series, as well as checking convergence. The example comes from Sect. 5 of [17]. The goal is to evaluate

$$\begin{aligned} \int _0^1 \frac{\log (1+x)}{x}\, dx \end{aligned}$$

The technique used is to expand the Taylor series for \(\log (1+x)\) (using rule SeriesExpansionIdentity), then exchange integration and summation. During the exchange the body of the sum and integral is \(\frac{(-1)^nx^n}{n+1}\). As the body changes sign for different values of n, there is potential danger that the sum is not absolutely convergent, and the exchange of sum and integral is incorrect even if the final answer is finite. To exclude this possibility, Iscalc requires the user to first show the convergence of \( \sum _{n=0}^\infty \int _0^1 \frac{x^n}{n+1}\, dx \). This is checked after the computation

$$ \sum _{n=0}^\infty \int _0^1 \frac{x^n}{n+1}\, dx = \sum _{n=0}^\infty \frac{1}{n+1} \int _0^1 x^n\, dx = \sum _{n=0}^\infty \frac{1}{(n+1)^2} $$

which is convergent by the p-series test implemented within Iscalc. This shows the exchange of sum and integral is indeed safe. The final result of the integral is \(\frac{\pi ^2}{12}\), which can be computed in Iscalc using 10 rule applications (including 3 for showing convergence), assuming the value of some standard infinite series is already known.
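Once the exchange is justified, the remaining computation can be written out as follows. This is a sketch that takes the value \(\sum _{n=0}^\infty \frac{1}{(n+1)^2}=\frac{\pi ^2}{6}\) as the known standard series; it is not a transcript of the Iscalc steps. The alternating sum is obtained by subtracting twice the even-indexed terms from the full sum.

$$\begin{aligned} \int _0^1 \frac{\log (1+x)}{x}\, dx&= \sum _{n=0}^\infty \frac{(-1)^n}{n+1}\int _0^1 x^n\, dx = \sum _{n=0}^\infty \frac{(-1)^n}{(n+1)^2} \\&= \sum _{k=1}^\infty \frac{1}{k^2} - 2\sum _{k=1}^\infty \frac{1}{(2k)^2} = \left( 1-\frac{1}{2}\right) \frac{\pi ^2}{6} = \frac{\pi ^2}{12} \end{aligned}$$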

4 Discussion

While there has been a long line of research on visual user-interfaces for interactive theorem proving, one persistent issue is that they are mostly limited to simple examples or narrow application areas. For large scale formalizations, the number of actions the user can perform steadily increases, so it becomes more and more difficult to organize them in the user interface. Our work can be seen as an exploration of how far we can go in the limited, but still wide area of symbolic computation. We believe the results are positive. In particular, the following design decisions contribute to controlling complexity:

  • Apply rules automatically as much as possible, so they never need to be explicitly selected by the user (e.g. normalization and inequality checking).

  • Group related identities into a single rule (e.g. integrals, series expansions, etc.). After the user selects one of these rules, perform matching on the list of available identities and provide choices to the user.

  • Group related rules into categories. For example, rules for evaluating integrals, rules for series manipulation, etc. This results in a two-level menu where the user may find appropriate rules more easily.

The end result is that the user does not need to recall names of any existing identity (in fact no names are assigned at all). Instead, all results are either applied automatically, or selected after matching from a list of suggested choices.

4.1 Related Work

There is a large body of work combining theorem proving and symbolic computation, and on user interface design for theorem provers. Some earlier works include Harrison and Théry’s “skeptic’s” approach to invoking computer algebra systems from a theorem prover [12], and Bauer et al.’s Analytica [5], which implements automatic theorem proving for elementary analysis within Mathematica. We leave a detailed review to [18, 19]. More recently, Lewis and Wu [14] implemented a bi-directional interface between Lean [16] and Mathematica. Donato et al. designed an interface for constructing proofs using drag-and-drop actions [7].

There are also many implementations of proof procedures related to computer algebra. Examples include the tool MetiTarski for proving inequalities by Akbarpour and Paulson [2], and the heuristic-based prover Polya by Avigad et al. [4]. For computation of limits, Eberl implemented verified computation of asymptotics with generated proofs in Isabelle [8]. We do not claim our procedures to be more effective than the ones listed above, but focus on their combination with user guidance to allow performing more complex symbolic computations.

5 Conclusion

In this paper, we introduced Iscalc for performing symbolic computation interactively, as a significant extension to the system described in [18]. This results in a more extensible tool with a greater range of applicability, in particular one able to check difficult computations from the textbook [17], and to find some mistakes in the process.

In future work, we wish to extend the functionality of Iscalc to handle complex numbers, multiple integrals, and vector calculus. One particularly interesting question is how to support evaluation of contour integrals (the formalization of which has been done in Isabelle by Li and Paulson [15]). On the applications side, we intend to explore verification of control systems [3].

Finally, more work would be required to extend the proof reconstruction in [18] to the larger set of functionality available, as well as linking with libraries of theorems in analysis. The custom language of expressions defined here is independent of any particular choice of logical foundation, hence proof reconstruction should be possible in any interactive theorem prover.