The Tinker tool for graphical tactic development
Abstract
PSGraph (Grov et al. in LPAR. Springer, Berlin, pp 324–339, 2013) is a graphical language to support the development and maintenance of proof tactics for interactive theorem provers. By using labelled hierarchical graphs this formalisation improves upon analysis and maintenance found in traditional tactic languages. Tool support for PSGraph is achieved by Tinker (Grov et al. in UITP 2014, ENTCS, vol 167. Open Publishing Association, London, pp 23–34, 2014; Lin et al. in Tools and algorithms for the construction and analysis of systems. Springer, Berlin, pp 573–579, 2016): a theorem prover-independent system, which is connected to several different provers, with a graphical user interface including novel features to develop and debug proof tactics graphically. In this paper we provide a detailed and formal account of PSGraph and show how theorem prover independence is achieved by Tinker. We then show practical use of PSGraph and Tinker by developing several proof patterns using the language and tool.
Keywords
Interactive theorem proving, Tactic languages, Development, Maintenance
1 Introduction
To fully grasp this strategy one needs to understand the detailed semantics of the various tacticals, such as REPEAT and ORELSE. For example, when does REPEAT terminate? Does it require the given tactic to run at least once? Or will it succeed if it cannot run to begin with? We have found that many mistakes are due to misunderstanding of such corner cases [28]. There is also the issue of debugging. How can one find the cause of a failure of the tactic? Or, possibly worse, the cause of a success but with an unexpected result. Such bugs are very hard to locate. Debugging is made even harder by “defensive programming” through the TRY tactical, which either applies a tactic or does nothing, as it is hard to see the overall strategy. The most common solution to find bugs is to manually break the tactic apart into subtactics and use, e.g. writeln statements to see the proof state at various points during evaluation.
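The corner cases raised above can be made concrete with a small model of tacticals. The following is a minimal sketch in Python, not the code of any particular prover: it assumes the common LCF-style convention that a tactic maps a goal to a list of subgoals and signals inapplicability by raising an exception. All names (`TacticFailure`, `split_conj`, and the Python versions of the tacticals) are introduced here for illustration only.

```python
from typing import Callable, List

Goal = str
Tactic = Callable[[Goal], List[Goal]]  # a tactic returns the new subgoals


class TacticFailure(Exception):
    """Raised when a tactic is not applicable to a goal."""


def ALL(goal: Goal) -> List[Goal]:
    """The identity tactic: always succeeds and leaves the goal unchanged."""
    return [goal]


def ORELSE(t1: Tactic, t2: Tactic) -> Tactic:
    def run(goal: Goal) -> List[Goal]:
        try:
            return t1(goal)
        except TacticFailure:
            return t2(goal)
    return run


def TRY(t: Tactic) -> Tactic:
    """Apply t if possible, otherwise do nothing (never fails)."""
    return ORELSE(t, ALL)


def THEN(t1: Tactic, t2: Tactic) -> Tactic:
    """Apply t1, then t2 to every resulting subgoal."""
    def run(goal: Goal) -> List[Goal]:
        return [g2 for g1 in t1(goal) for g2 in t2(g1)]
    return run


def REPEAT(t: Tactic) -> Tactic:
    """Apply t as often as possible; zero applications still count as success."""
    def run(goal: Goal) -> List[Goal]:
        try:
            subgoals = t(goal)
        except TacticFailure:
            return [goal]
        return [g2 for g1 in subgoals for g2 in run(g1)]
    return run


def split_conj(goal: Goal) -> List[Goal]:
    """Toy tactic: split 'A & B' into the goals 'A' and 'B'."""
    if " & " not in goal:
        raise TacticFailure(goal)
    left, right = goal.split(" & ", 1)
    return [left, right]
```

Under these assumptions, `REPEAT(split_conj)` succeeds on a goal without any conjunction and simply returns it unchanged, which is exactly the kind of behaviour that is easy to overlook when reading a tactic. A variant that requires at least one successful application would behave differently on the same input.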
The first thing to notice is that simple_quantifier_tac repeatedly (REPEAT) applies three components in sequence (using THEN). It is unlikely that the developer had such a sequential ordering in mind; it is more likely a byproduct where the tactic language enforces an order (combined with “defensive” programming using TRY).
The corresponding PSGraph draws them in parallel, which is likely to be closer to the high-level strategy that the developer had in mind. The goal types on the wires are used to explicitly direct the goals to the suitable tactic, while also being explicit about issues such as termination conditions for loops. The advantage is that a user can read the overall strategy directly from the graph, with the justification for the choices (via goal types), without detailed knowledge of the semantics of the tacticals used^{3}. For this particular example, if we know that \(\textit{strip}\_\wedge \) strips conjunctions and simp_ex and simp_forall eliminate existential and universal quantifiers, then one can see directly from the graph what the strategy does. We will return to the details of this strategy in the next section, when describing the PSGraph language.

We give a formal account of PSGraph, including a formal operational evaluation semantics.

We encode several proof strategies in PSGraph to illustrate usability.

We provide details of how theorem prover independence is achieved by Tinker.
2 The PSGraph language
An important part of PSGraph is the goal types. We first give an exposition of goal types before discussing the PSGraph language in general.
2.1 Goal types
A goal type is a predicate over a goal. Each wire of a PSGraph is labelled by a goal type. The simplest goal types are predicates on the goal terms: e.g. the PSGraph of Fig. 1 has a wire with the goal type c(conj), which states that the conclusion of the goal is a conjunction (i.e. of the shape “\(\_ \wedge \_\)”).
The language used to express goal types was introduced in [28]. It combines atomic goal types, which are relations provided by the provers, with a Prolog-based language that combines them. Following Prolog convention, constants start with lower case (e.g. x, y, xs), and goal type variables start with upper case (e.g. X, Y, Xs). Two common constants are concl, which is the goal conclusion, and hyps, the list of hypotheses. Terms used by the prover can also be encoded in the goal types. In addition to goal type variables, there is another notion of variables, which are prefixed by “?” (e.g. ?x, ?y, ?xs). They are treated as constants (or terms) in goal types. We will return to them below.
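To illustrate the idea that a goal type is a predicate over a goal, the following Python sketch models goals as a pair of hypotheses and a conclusion and builds goal types as Boolean functions. It is only an approximation: the actual goal type language is Prolog-based [28], and the term encoding (nested tuples) and the helpers `some_hyp`, `both` and `either` are assumptions made for this example. Only c(conj) is taken from the text.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class Goal:
    hyps: Tuple[object, ...]  # the list of hypotheses
    concl: object             # the goal conclusion


GoalType = Callable[[Goal], bool]


def top_symbol(term):
    """Top-level symbol of a term; terms are tuples such as ("conj", p, q)."""
    return term[0] if isinstance(term, tuple) else term


def c(symbol: str) -> GoalType:
    """Goal type c(symbol): the conclusion has `symbol` at the top."""
    return lambda goal: top_symbol(goal.concl) == symbol


def some_hyp(symbol: str) -> GoalType:
    """Hypothetical goal type: some hypothesis has `symbol` at the top."""
    return lambda goal: any(top_symbol(h) == symbol for h in goal.hyps)


def both(gt1: GoalType, gt2: GoalType) -> GoalType:
    """Conjunction of two goal types."""
    return lambda goal: gt1(goal) and gt2(goal)


def either(gt1: GoalType, gt2: GoalType) -> GoalType:
    """Disjunction of two goal types."""
    return lambda goal: gt1(goal) or gt2(goal)
```

With this encoding, `c("conj")` accepts exactly the goals whose conclusion has the shape “\(\_ \wedge \_\)”, and compound goal types are obtained by combining the atomic ones.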
2.2 Graphical proof strategies
Figure 2 shows the types of boxes that a PSGraph may contain: an atomic tactic is a box that is labelled by a tactic of the underlying theorem prover or an environment tactic (discussed below); a graph tactic is a box labelled by a named nested graph, which is used to handle modularity; an identity tactic does not change the goal and is used to merge and split wires; a goal node contains an open goal to be proven, which can only be added or changed by tactic application; and breakpoints are used to control evaluation, which we return to in Sect. 5.
At a given time during the proof of a conjecture using PSGraph, there are one or more open goals on the graph that have to be proven. A theorem prover will keep track of these in a proof state. A tactic application will be applied to one goal of the proof state and will replace this goal with the newly generated goals, if any.
An identity box does not change the goal it evaluates. It is used to split and merge the paths a goal can take. Figure 1 illustrates both these features, where an identity box is used together with the goal type to send a goal to the correct destination (split) and then to merge the outputs again.
The individual tactics work as follows: the first changes paired quantifiers to uncurried versions; the second removes the quantified variables if they are not used in the body; the third distributes quantifiers over \(\wedge \); the fourth simplifies goals with the one point rule [40] (see Sect. 6). These are attempted in the given order until one of the tactics succeeds. As simp_ex is prefixed by TRY, the tactic will do nothing if all the tactics fail. simp_forall is omitted as its encoding is similar.
In the PSGraph version of this code, shown in Fig. 3, the order does not matter: a goal will be applied to the tactic where the incoming goal type succeeds. These are therefore drawn in parallel. If there are multiple goal types that succeed then all of them are tried. For speed and maintainability, it is important that strategies are as deterministic as possible. One thus needs to take care when developing goal types. Note that we ignore the overall TRY tactical as the goal type on the input wire should ensure that a tactic is applicable.
Figure 3 uses the same tactics as the code. From PSGraph’s perspective, these tactics are the atoms, as they become “black boxes” that cannot be split further. We therefore call them atomic tactics. A box with an atomic tactic is labelled by the tactic name and optional arguments which we return to below. Figure 1 also contains a single tactic called \(\textit{strip}\_\wedge \); it will break a conjunction into a new goal for each of the conjuncts. As the feedback-loop label shows, this is applied as long as the conclusion is a conjunction (goal type c(conj)). The overall strategy is applied as long as one tactic is applicable, identified by the can_simp() goal type.
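The role of goal types in directing goals can be sketched as follows. This Python fragment is an illustration under simplifying assumptions, not Tinker's implementation: a goal is represented only by its conclusion (a nested tuple), wires are (name, goal type) pairs with made-up names, and `strip_conj` stands in for \(\textit{strip}\_\wedge \).

```python
def c(symbol):
    """Goal type: the top-level symbol of the conclusion is `symbol`."""
    return lambda goal: isinstance(goal, tuple) and goal[0] == symbol


def route(goal, wires):
    """Names of all wires whose goal type accepts the goal (one branch each)."""
    return [name for name, goal_type in wires if goal_type(goal)]


def strip_conj(goal):
    """Stand-in for strip_∧: one new goal per conjunct."""
    _, left, right = goal
    return [left, right]


def feedback_loop(goal):
    """Apply strip_conj as long as the goal type c(conj) holds."""
    pending, finished = [goal], []
    while pending:
        g = pending.pop(0)
        if c("conj")(g):
            pending = strip_conj(g) + pending  # goal is sent round the loop
        else:
            finished.append(g)                 # goal leaves the loop
    return finished
```

If two wires carry overlapping goal types, `route` returns both names and evaluation has to explore one branch for each. Making the goal types on sibling wires mutually exclusive removes this branching.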
3 Proving with PSGraph
A PSGraph may be open [14], in that a wire may not have a source or not have a destination. A wire without a source is an input for the graph, while a wire without a destination is an output of the graph. To illustrate, consider a graph G with two wires without a source, labelled by goal types \(I_1\) and \(I_2\), and two wires without a destination, labelled by goal types \(O_1\) and \(O_2\):
In order to apply such a PSGraph to (possibly partially) prove a conjecture, the conjecture has to be initialised first in the proof state of the underlying theorem prover. The goal then has to be wrapped in a goal node:
Definition 1
(Goal) A goal node, represented by the type Goal, contains a goal name of the actual goal in the proof state and an environment env.
A goal node can then be added to one of the input wires. In order to add the goal to an input, it must satisfy the goal type. That is, if it succeeds for \(I_1\) then it can be added to this wire. If it succeeds for both \(I_1\) and \(I_2\) then both of them are tried separately. Once added, this goal, or any goals generated from subsequent tactic applications, will then “flow” through the graph until all the goals are on the output wires, i.e. \(O_1\) or \(O_2\). At this point evaluation has terminated:
Definition 2
(Termination (normal mode)) A graph has terminated, if for all goals g of the graph, the destination of the output wire for g is either the graph output or another goal.
We can reduce the discussion of evaluation to a single step over one of the boxes of the graph. A full graph is evaluated by repeating such steps until termination is reached. This process uses and updates two components: the PSGraph \(PG{}\), which keeps track of the state of the strategy; and a proof state \(P{}\), keeping track of the goals and the necessary bookkeeping required by the theorem prover. The proof state is handled by the underlying prover and thus will vary between provers as detailed in § 4.
At the graph level, evaluation works by rewriting. A rewrite rule is written as \(L \hookrightarrow R\), which, when applied, will replace a subgraph L with R. The rewriting uses rich pattern graphs [21, 22], which are used to express repetition using ellipses (\(\cdots \)). A formal account of pattern graphs can be found in [21, 22] and is beyond the scope of this paper. To increase readability, we omit some of the goal types where they are not used, but note that each wire will have a label even if not shown.
For an atomic tactic T(Args) applied to a goal g, a single evaluation step proceeds as follows:
1. Consume g from the graph.
2. Apply T(Args) to g to obtain a set of results (lists of goals paired with a proof state) from the application of tactic T with the given arguments Args.
3. Add all valid combinations of the resulting goals to the output wires. These are combined with the new proof state.
For a graph tactic T(Args) applied to a goal g, the step is:
1. Look up the graph G that T refers to.
2. Consume g from the graph, and create \(g'\) where the environment is constrained to the variables in Args.
3. Add \(g'\) to all the input wires of G where \(g'\) satisfies the goal type (one branch for each), and evaluate until termination. The resulting goals should now be on the output wires.
4. Add all valid combinations of the resulting goals to the output wires of T. When adding these, the environment should be replaced by the environment of g, updated by any changes to variables in Args.
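The environment handling in the steps for a graph tactic can be sketched in a few lines of Python. This is a simplified reading of the description above: environments are modelled as dictionaries from variable names to values, and it is assumed that the variables in Args carry the same names inside and outside the nested graph. The function names are introduced for this example only.

```python
def enter_graph_tactic(env, args):
    """Constrain the environment of g to the variables in Args (giving g')."""
    return {var: env[var] for var in args if var in env}


def leave_graph_tactic(outer_env, inner_env, args):
    """Restore the environment of g, updated by changes to variables in Args."""
    updated = dict(outer_env)
    updated.update({var: inner_env[var] for var in args if var in inner_env})
    return updated
```

A variable bound inside the nested graph but not listed in Args is therefore not visible after the graph tactic has been evaluated, while variables of the outer environment that were hidden on entry are restored unchanged.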
In addition to atomic and graph tactics, a graph can also contain identity tactics and breakpoints. These have no side effects on the proof state and are shown as the rewrite ruleset R in Fig. 5. A goal g is simply “moved over” an identity box or breakpoint, as long as the goal type of the target wire is satisfied. A breakpoint is only allowed to have one input and one output wire, labelled by the same goal type. The final type of node is a goal node, and it is not allowed to “swap” two adjacent goal nodes, albeit this may be added in the future to implement heuristic-based evaluation strategies.
A PSGraph is evaluated (to completion) by applying the rewrite rules outlined above until none are applicable. If the termination condition holds at that point, then it has successfully evaluated; if not, evaluation has failed. In § 5 we will discuss a “debugging” mode with a slightly different semantics.
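The overall evaluation loop can be summarised by the following Python sketch. It abstracts from graphs entirely: `step` is an assumed function that returns every state reachable by applying one rewrite rule (an empty list if none applies), and `terminated` is an assumed predicate implementing the termination condition. It is meant to show the control structure only.

```python
def evaluate(state, step, terminated):
    """Return all successfully terminated states reachable from `state`."""
    successors = step(state)
    if not successors:
        # No rewrite rule is applicable: success only if terminated.
        return [state] if terminated(state) else []
    results = []
    for nxt in successors:  # one branch per possible rewrite
        results.extend(evaluate(nxt, step, terminated))
    return results
```

An empty result means that every branch got stuck before the termination condition held, i.e. evaluation has failed.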
3.1 A formal account of evaluation
To give a formal account of PSGraph, we use a VDM-based mathematical notation [5]. In particular, the notation provides: the set of all subsets of a set P; sequences of type T, where [] is the empty sequence and a dedicated operator adds element x to sequence xs; the (set of) elements of a sequence S; the domain of a relation R; and domain subtraction together with its dual, domain restriction.
We do not provide the operational semantics for a goal type (\(PG\vdash g : G\)) as this is given in [28], where the goal type language was introduced.
We use the type Graph to describe an actual graph of a PSGraph, which is an instance of a string diagram [14]. We write \(G[L \hookrightarrow R]\) for the application of the rewrite rule \(L \hookrightarrow R\) to graph G. This will return a set of new graphs. We write \(G_1 \hookrightarrow _R G_2\) for applying a rule in the ruleset R to rewrite \(G_1\) into \(G_2\). Rewriting is achieved by matching L with the graphs we are rewriting. Then this matching subgraph is removed from the graph and replaced by R.
We use two combinators to compose graphs:
Definition 3
(\( {THEN}_G\) combinator) \(G_1\) \(\textit{THEN}_G\) \(G_2\) connects all the outputs of \(G_1\) to all inputs of \(G_2\) of the same type. Diagrammatically this can be seen as follows:
\(\textit{THEN}_G\) is only defined when all outputs/inputs of \(G_1\)/ \(G_2\) are connected.
Horizontal composition is a result of putting the two graphs next to each other:
Definition 4
(\(\otimes \) combinator) \(G_1\) \(\otimes \) \(G_2\) is the horizontal combinator which puts two graphs side by side:
Details of the semantics underpinning both the rewriting and composition are beyond the scope of this paper, and we refer to [14] for details.
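The two combinators can be approximated on the interface of a graph alone. The Python sketch below keeps only the goal types of the open input and output wires plus the box names, and assumes that \(\textit{THEN}_G\) pairs outputs and inputs one-to-one by goal type. That reading is a simplification made for this example; the precise definition is the one in [14].

```python
from collections import Counter
from dataclasses import dataclass
from typing import List


@dataclass
class OpenGraph:
    inputs: List[str]   # goal types of wires without a source
    outputs: List[str]  # goal types of wires without a destination
    boxes: List[str]    # names of the boxes in the graph


def tensor(g1: OpenGraph, g2: OpenGraph) -> OpenGraph:
    """Horizontal composition: put the two graphs side by side."""
    return OpenGraph(g1.inputs + g2.inputs,
                     g1.outputs + g2.outputs,
                     g1.boxes + g2.boxes)


def then_g(g1: OpenGraph, g2: OpenGraph) -> OpenGraph:
    """Sequential composition: plug the outputs of g1 into the inputs of g2."""
    if Counter(g1.outputs) != Counter(g2.inputs):
        raise ValueError("THEN_G undefined: some output or input stays unconnected")
    return OpenGraph(list(g1.inputs), list(g2.outputs), g1.boxes + g2.boxes)
```

For instance, two single-box graphs placed side by side with `tensor` can be composed with a third graph whose inputs match their combined outputs, whereas composing only one of them with that graph is rejected.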
We can now define a PSGraph:
Definition 5
\(\Downarrow _R\) is captured by Right, and R is generated by first removing the goal g by rewriting and then applying the tactic T(Args) using the \(\Downarrow _{T}\) relation to get the new goals. From the output goal types, a valid goal type partition is created from the output goals:
Definition 6
(Ordered partition [16]) For a sequence L, we say a sequence of sequences \(L'\) is an ordered partition of L if all the sequences are distinct, \(L'\) contains the same elements as L and each sequence in \(L'\) is obtained by deleting zero or more elements of L (i.e. the order of L is preserved).
Definition 7
The set of partitions is empty precisely when there is a goal in L that is not of type \(G_k\) for any k.
Example 1
Each partition is turned into a graph by gnds:
These are then horizontally composed and then sequentially composed with the left-hand side (with the input goal removed). Evaluation of an atomic box (Atomic) is achieved (\(\Downarrow _{T}\) relation) by a function eval that executes the tactic. This has to be provided by the underlying prover. We return to how this is achieved in § 4, when discussing the Tinker tool.
Example 2
To illustrate evaluation of an atomic tactic, consider a pair of a PSGraph \(PG{}\) and proof state \(P{}\), such that the current graph of \(PG{}\) is:

For the first element, the goal type partition is given in Example 1.

For the second element, the singleton set \(\Big \{\big [[],[g_4]\big ] \Big \}\) is returned.

For the third and final element, the empty set \(\{\}\) is returned. This means that a goal cannot be added to any of the output wires. As a result, this element is discarded.
Note that the rightmost graph has terminated as the goal node is on the output wire.
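The goal type partition used when placing new goals on output wires can be sketched as follows. This Python function is reconstructed from the surrounding description rather than from the formal definition: it distributes the goals over the output wires so that every goal lands on exactly one wire whose goal type it satisfies, preserving the order of the goals. The `satisfies` argument is an assumed stand-in for the goal type judgement \(PG\vdash g : G\).

```python
from itertools import product


def goal_type_partitions(goals, goal_types, satisfies):
    """All ordered partitions of `goals` over wires labelled `goal_types`."""
    # For each goal, the indices of the wires it may be placed on.
    choices = [[k for k, gt in enumerate(goal_types) if satisfies(g, gt)]
               for g in goals]
    partitions = []
    for assignment in product(*choices):
        partition = [[] for _ in goal_types]
        for g, k in zip(goals, assignment):
            partition[k].append(g)  # appending keeps the original order
        partitions.append(partition)
    return partitions
```

If some goal satisfies none of the goal types, its list of choices is empty and no partition is produced, matching the case where a result of the tactic is discarded. A goal that satisfies several goal types gives rise to one partition per choice.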
Environment tactics are used to change the environments; this class of tactics is represented by the set Env, which in practice is any atomic tactic starting with “ENV_”. These are in most cases theorem prover specific and evaluated by the eval_env function, which will return a set of new environments.
For a graph tactic (Graph), the environment is first constrained to the given arguments; then the child graph is evaluated by \(\Downarrow _{GG}^{*}\). On termination, the output goals of the child are returned, once the environment has been updated as previously discussed. AddInput is used to add an input goal to the graph. It follows the same approach as Right by using the two combinators and the gnds function.
We state four key properties, without proof, which follow directly from the semantics:
Claim 1
A goal can only be generated by a tactic application.
Claim 2
No goals are “lost”, meaning that if a tactic generates a goal then it will appear on the graph and remain there until it has been discharged by a tactic.
Claim 3
No goals are duplicated, i.e. there is only a single instance of an open goal in the graph.
Claim 4
Evaluation will only change the current graph of a PSGraph.
4 The Tinker tool^{4}
The Tinker tool [17, 30] implements PSGraph with support for the Isabelle [35], ProofPower [4] and Rodin [1] theorem provers. Tinker consists of two parts: the CORE and the GUI^{5}. The rightmost shaded box of Fig. 7 is the CORE, while the bottom-left box is the GUI. The CORE implements the main functions of Tinker, using the Quantomatic diagrammatic proof assistant to represent and rewrite graphs [23].
In § 5 we give details of the GUI, focusing on user features for working with the system. In the remainder of this section we will discuss the CORE. Details of how it is implemented and interacts with Quantomatic are described in [17]. Here, we will focus on how theorem prover independence is achieved and how the different provers have been integrated. For a more tutorial-like exposition of how to go about connecting a new prover, we refer to [28].
A theorem prover is connected by providing an implementation of an ML signature called PROVER. Figure 7 shows three implementations of this signature using three different ML structures: Isa_Tinker (Isabelle), PP_Tinker (ProofPower) and Rodin_Tinker (Rodin).
Proof states and goals. The crucial part of a prover integration is the representation of proof states and goals. While Tinker uses the underlying prover’s representation, these differ between the various provers. The user therefore has to implement the PROVER signature to represent these, possibly augmented with some additional bookkeeping information. These act as a bridge between Tinker and the theorem prover. The proof state must keep track of the goals that are “active” in the PSGraph, while from Tinker’s point of view a goal is just a named element that can be used to interact with the proof state (e.g. to apply a tactic to it).
In ProofPower, the proof state is represented by an element of type GOAL_STATE. Here, each goal is named so we just use this name for the goal.
Rodin [1] is Eclipse-based and implemented in Java. As shown in Fig. 7, the Rodin integration consists of two components: a Tinker plugin for Rodin and an implementation of the PROVER signature by the Rodin_Tinker structure. These communicate over a JSON protocol. Here, the main functionality is in the plugin and the only thing done by the Rodin_Tinker structure is to handle communication between the Rodin plugin and Tinker CORE. In Rodin, the proof state is handled by the Rodin Proof Obligation Manager (POM).
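The shape of such a prover interface can be conveyed by a small Python analogue. The real PROVER signature is written in ML and its members are not reproduced here, so every name below (`Prover`, `init`, `apply_tactic`, `ToyProver`) is hypothetical. The sketch only captures the idea that Tinker sees goals as names and delegates everything else to a prover-specific proof state.

```python
from abc import ABC, abstractmethod


class Prover(ABC):
    """Hypothetical Python analogue of a prover interface for Tinker."""

    @abstractmethod
    def init(self, conjecture):
        """Return (proof_state, goal_name) for a new conjecture."""

    @abstractmethod
    def apply_tactic(self, pstate, goal_name, tactic):
        """Return a list of (new_goal_names, new_proof_state) results."""


class ToyProver(Prover):
    """Toy instance: the proof state maps goal names to terms."""

    def __init__(self):
        self._count = 0

    def _fresh(self):
        self._count += 1
        return f"g{self._count}"

    def init(self, conjecture):
        name = self._fresh()
        return {name: conjecture}, name

    def apply_tactic(self, pstate, goal_name, tactic):
        new_terms = tactic(pstate[goal_name])
        new_state = {k: v for k, v in pstate.items() if k != goal_name}
        names = [self._fresh() for _ in new_terms]
        new_state.update(zip(names, new_terms))
        return [(names, new_state)]
```

A graph evaluator written against `Prover` never inspects the proof state itself, which is what allows the same evaluator to be reused with different provers.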
A key challenge when connecting a theorem prover to Tinker is that its internal representation of tactics has to be ported to work on the representation of proof states and goals from the PROVER signature.
Tactics without arguments (e.g. those in Fig. 3) can be handled by a generic “wrapper” function that turns the underlying prover representation to the one required by Tinker. In Isabelle, a tactic takes as argument the position of the goal^{7}. “The wrapper” looks up the position from the name, then generates a new fresh name for each new goal and updates the map. ProofPower keeps track of the goals in a stack, and a tactic is applied to the goal that is on top of the stack. Thus, to apply a given tactic to a given goal, “the wrapper” first moves the goal to the top of the stack and then applies the tactic. In Rodin, the plugin is responsible for calling the correct tactic in the POM and sends the updated proof status back to the CORE.
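The two wrapping strategies described above can be sketched in Python. Both functions are illustrations under simplifying assumptions and not the actual wrappers: the proof state is a plain list of goals, the position-based variant uses 0-based indices internally (Isabelle counts subgoals from 1), and the name supply `make_fresh` is made up for this example.

```python
import itertools


def make_fresh(prefix="g"):
    """Return a function that generates fresh goal names."""
    counter = itertools.count(1)
    return lambda: f"{prefix}{next(counter)}"


def positional_wrapper(tac, names, subgoals, goal_name, fresh):
    """Position-based style: look up the position from the name, apply the
    tactic there, and give each new goal a fresh name."""
    i = names.index(goal_name)
    new_goals = tac(subgoals[i])
    new_names = [fresh() for _ in new_goals]
    return (names[:i] + new_names + names[i + 1:],
            subgoals[:i] + new_goals + subgoals[i + 1:])


def stack_wrapper(tac, stack, goal):
    """Stack-based style: move the goal to the top of the stack (the end of
    the list), then apply the tactic to the top."""
    if goal not in stack:
        raise KeyError(goal)
    stack = [g for g in stack if g != goal] + [goal]
    top = stack.pop()
    return stack + tac(top)
```

In both cases the tactic itself is unchanged; only the bookkeeping around it differs between the provers.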
If a tactic has arguments, then a user must manually implement a version of this tactic for Tinker where the arguments are represented as a list of a given type (called arg_data). This is a deep embedding of the supported types^{8}. Except for this, it is handled the same way as a tactic without arguments.
Environment tactics. Environment tactics only update the environment of a goal node. In the semantics of Fig. 6 the execution of an environment tactic is achieved by the \( \textit{eval}\_\textit{env}(T,Args,env) \) function. It will return a set of new environments. These are handled in a similar way to atomic tactics, although more simply, as there is no proof state to update.
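A deep embedding of arguments and the evaluation of an environment tactic might look as follows in Python. The constructors `AStr`, `ATerm` and `AVar` and the tactic name `ENV_bind` are hypothetical and chosen for this example only; the actual arg_data type and environment tactics are prover-specific [28].

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class AStr:
    value: str     # a string argument


@dataclass(frozen=True)
class ATerm:
    value: tuple   # a prover term, here a nested tuple


@dataclass(frozen=True)
class AVar:
    name: str      # a goal type variable, resolved via the environment


ArgData = Union[AStr, ATerm, AVar]


def eval_env(tactic, args, env):
    """Return the list of new environments produced by an environment tactic."""
    if tactic == "ENV_bind":  # hypothetical tactic: bind a variable to a value
        var, value = args
        return [{**env, var.name: value.value}]
    return []                 # unknown tactic: no result, i.e. failure
```

Because an environment tactic returns environments only, the caller can leave the proof state untouched and simply create one goal node per returned environment.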
5 Developing and debugging with the GUI^{9}
The Tinker GUI provides users with an additional interface to support the development and debugging of proof strategies; for all other tasks the existing GUI of the underlying theorem prover is used.
5.1 Developing proof strategies
Figure 8 shows the Tinker GUI and its layout. Here, a user can draw a graph from the Graph panel by selecting the type of node from the Drawing and evaluation controls panel. Tactics are connected by dragging a line between them. When selecting an entity, the details are displayed in the Information panel, and they can be edited by double clicking^{10}. It supports all the nodes of Fig. 2, albeit a user cannot draw a goal as this has to be created by the theorem prover.
Tinker allows “boxing” of subgraphs into hierarchies, by a simple mouse click. Tinker also supports a range of features to work with hierarchies. In the Hierarchical node inspector, users can preview the internal structure of a hierarchical node. In the Hierarchy utilities panel (top right of Fig. 8), the hierarchical path of the current graph under edit is shown, as well as a tree view of the hierarchical structure of a PSGraph. It is also easy to move between and edit hierarchical nodes.
Graphical libraries. Reuse of PSGraphs is supported by a library. This feature is provided in the Library panel. The items in the library are PSGraphs and stored in a special purpose folder. A new PSGraph can be added to the library by copying the relevant file to this directory. When importing an item from the library to the current PSGraph, Tinker will copy it to the graph that the user is currently editing and merge all the required information, such as defined tactics and goal types (see below).
Recording and replaying. Tinker provides features to export PSGraphs and record proofs. A PSGraph can be exported to the SVG format. The recording feature can be switched on/off to start/pause the recording of changes made to a graph. These changes could have been made by the user or by the tool during evaluation. Once completed, such a recording can be exported to a lightweight web application (written using HTML, CSS and JavaScript) via a generated JSON file. Examples of recordings can be found at [18].
5.2 Debugging proof strategies
1. In an automatic (or normal) mode it is treated as a black box, and all you see is the resulting goals from applying it. This mode is the same as applying a normal tactic/method from the prover. In Isabelle/Isar, a PSGraph \(\langle psgraph \rangle \) is executed by the command \(\texttt {apply tinker} (\langle \texttt {psgraph} \rangle )\); in ProofPower it is achieved by the command apply_ps \(\langle psgraph \rangle \), while a button is available for Rodin.
2. In an interactive mode, the steps of evaluation are inspected in the GUI, achieved by the commands apply itinker (Isabelle) and apply_ps_i (ProofPower).
3. In a debugging mode, which combines these modes as detailed below.
Breakpoints. A novel feature of Tinker is the support for breakpoints, which can be added/removed from wires by a simple mouse click. This was introduced in [28]. In the presence of breakpoints in the debugging mode, we introduce a new definition of termination, which differs from Definition 2 by also allowing a breakpoint node as destination:
Definition 8
(Termination (debug mode)) A graph has terminated, if for all goals g of the graph, the destination of the output wire of g is either the graph output, a breakpoint node, or another goal.
The semantics is updated to handle this mode by removing the rewrite rule for breakpoints from R (see Fig. 5).
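The difference between the two termination conditions can be expressed compactly. In the following Python sketch, a graph is abstracted to a mapping from each goal to the kind of node at the destination of its output wire; the kind labels are names chosen for this example.

```python
NORMAL_STOPS = {"output", "goal"}               # Definition 2
DEBUG_STOPS = NORMAL_STOPS | {"breakpoint"}     # Definition 8


def terminated(goal_destinations, debug=False):
    """Check termination, given for each goal the kind of node its output
    wire leads to ("output", "goal", "breakpoint", "atomic", ...)."""
    stops = DEBUG_STOPS if debug else NORMAL_STOPS
    return all(kind in stops for kind in goal_destinations.values())
```

A graph in which some goal is waiting in front of a breakpoint has thus terminated in debugging mode but not in normal mode, where the goal would be moved over the breakpoint instead.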
where ENV_DATA is a tag for the environment of the goal which is currently evaluated. We will see practical use of the logging mechanism in § 6.
6 Proof patterns as PSGraphs
To illustrate use of PSGraph and Tinker we encode three proof patterns of increasing complexity.
6.1 “Disjunctions to the top”

match(P, T) holds if term T matches pattern P. This uses Isabelle’s pattern-matching capabilities for terms.

\(into\_ex(X,Z)\) steps into the body of an existential. For example, for \(into\_ex(\exists {x}{P},Z)\), Z will be bound to P.
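The behaviour of these two atomic goal types can be imitated on a toy term representation. The Python sketch below is not Isabelle's matcher: it implements plain first-order matching on nested tuples, and it assumes that pattern variables are the names prefixed by “?”. An existential \(\exists x. P\) is encoded as the tuple ("ex", x, P).

```python
def match(pattern, term, subst=None):
    """Return a substitution making `pattern` equal to `term`, or None."""
    subst = dict(subst or {})
    if isinstance(pattern, str) and pattern.startswith("?"):
        if pattern in subst and subst[pattern] != term:
            return None               # variable already bound differently
        subst[pattern] = term
        return subst
    if (isinstance(pattern, tuple) and isinstance(term, tuple)
            and len(pattern) == len(term)):
        for p, t in zip(pattern, term):
            subst = match(p, t, subst)
            if subst is None:
                return None
        return subst
    return subst if pattern == term else None


def into_ex(term):
    """Return the body of an existential ("ex", var, body), else None."""
    if isinstance(term, tuple) and len(term) == 3 and term[0] == "ex":
        return term[2]
    return None
```

Note that a successful match without variables returns an empty substitution, so callers should test the result against `None` rather than for truthiness.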
6.2 Existential quantifiers via the “one point rule”
Here, we develop a proof strategy for instantiating existentially bound variables through a technique known as the one point rule [15, 40].

first find the term to be used as a witness;

then move the corresponding existential binder to the top, if it is not already at the top;

finally, instantiate the binder with the witness.

is_one_point(X) holds if the one point rule is applicable, i.e. X has the shape required;

is_top(K,T) holds if term \(x = K\) or term \(K = x\) is a subterm of T, and x is existentially bound at the top level;

depth(K,T,D) holds if term \(x = K\) or term \(K = x\) is a subterm of T, and x has D preceding existential binders;

less(X,Y) implements the order \(X < Y\).

ENV_one_point_match(T,V) applies the “one point rule match” by finding the (first) instance of \(x=K\) or \(K=x\) in T, such that x is existentially bound (and only preceded by other existential binders). It will then bind term K to V in the goal type environment;

ENV_exists_depth(T,K,N) will assign to N the number of binders preceding the binder of x in T, with \(x = K\) or \(K = x\).
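The witness search behind these goal types and environment tactics can be sketched on a toy term language. The Python code below is an illustration, not the tactics' implementation. Terms are nested tuples with ("ex", x, body), ("and", a, b) and ("eq", l, r). As simplifying assumptions, equations are only sought among the conjuncts of the body, bound names are distinct, and the witness K must not mention x.

```python
def strip_exists(term):
    """Split ∃x1 ... ∃xn. body into ([x1, ..., xn], body)."""
    binders = []
    while isinstance(term, tuple) and term[0] == "ex":
        binders.append(term[1])
        term = term[2]
    return binders, term


def conjuncts(term):
    if isinstance(term, tuple) and term[0] == "and":
        return conjuncts(term[1]) + conjuncts(term[2])
    return [term]


def mentions(term, var):
    if isinstance(term, tuple):
        return any(mentions(t, var) for t in term)
    return term == var


def one_point_match(term):
    """Find the first x = K or K = x with x existentially bound.
    Returns (x, K, depth), where depth counts the preceding binders."""
    binders, body = strip_exists(term)
    for conj in conjuncts(body):
        if isinstance(conj, tuple) and conj[0] == "eq":
            for x, k in ((conj[1], conj[2]), (conj[2], conj[1])):
                if x in binders and not mentions(k, x):
                    return x, k, binders.index(x)
    return None


def substitute(term, var, value):
    if term == var:
        return value
    if isinstance(term, tuple):
        return tuple(substitute(t, var, value) for t in term)
    return term


def instantiate(term):
    """Drop the binder of x and replace x by its witness K in the body."""
    found = one_point_match(term)
    if found is None:
        return None
    x, k, _ = found
    binders, body = strip_exists(term)
    body = substitute(body, x, k)
    for b in reversed([b for b in binders if b != x]):
        body = ("ex", b, body)
    return body
```

For \(\exists y. \exists x.\ x = 1 \wedge p(x, y)\) the sketch finds the witness 1 for x at depth 1 and produces \(\exists y.\ 1 = 1 \wedge p(1, y)\); the trivial equation would then be removed by simplification.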
6.3 Rippling
Figure 12 shows such an encoding of rippling in PSGraph. It is based upon [27] and extended with a limited form of a technique called piecewise fertilisation [3] (graph tactic pwf_forall_conj).
In our case, strong fertilisation will discharge the goal. This concludes rippling and illustrates how to represent a complex and well-known proof technique in PSGraph. We believe that the graphical view helps in explaining how this technique works.
7 Related work
This paper extends and builds on [30], where we introduced new features of the Tinker GUI. We also build on [16, 17], where the Tinker tool and PSGraph were first introduced. In [28] we develop the goal types and environment tactics used here, as well as the breakpoints and logging features, with the formal semantics for evaluation with breakpoints and environment tactics given here. [28] also contains an empirical study of PSGraph and Tinker via a set of reengineered case studies developed in ProofPower. This can be contrasted to this paper, where we show how to implement existing proof patterns from scratch in PSGraph. In [29], we show industrial application of PSGraph with ongoing work of implementing DRisQ Software Systems’ (www.drisq.com) highly complex Supertac tactic in ProofPower [32]. Several of the features described here have been motivated by this work. We have also previously developed a Rodin version of Tinker [25], which we have briefly described in Sect. 4.
Tinker is built on top of the Quantomatic graph-rewriting engine [23], which is used internally as a library function. There is also a web-based version of Tinker, which supports a subset of the GUI features discussed here [7]. There has been a considerable amount of work on visualising proof trees, including: L\(\Omega \)UI [37] for \(\Omega \)mega; XIsabelle [33] for Isabelle; ProveEasy [11] and Jape [6] for teaching; and some more recent work for Mizar [26, 34]. However, none of these visualise the high-level strategy. Moreover, in PSGraph the diagram is not just a visualisation of the proof strategy: it is the proof strategy! Bundy [9] has argued for the role of explanation for proof strategies, and we hope that we have shown how PSGraph can help explain the strategy of a proof in addition to being used to guide the search. Such explanation is important for maintenance when teams change, and our work with DRisQ [29] has had very promising initial results when porting their proof tactics from Ada to C verification.
While there are tactic languages that support robust and user-friendly tactics (e.g. Ltac [12] for Coq and Eisbach [31] for Isabelle), we believe that the development and debugging features of Tinker are novel. The most relevant tool that we are familiar with to support debugging tactics is the Tactician tool for HOL Light [2]. In HOL Light, a proof can be a sequence of (interactive) “apply” steps, or they can be combined into a single step (by means of tactic combinators) which is then applied. Tactician is a tool to fold sequences into a single tactic and unfold a tactic into a sequence of steps. This can then be used for debugging by enabling users to step through a large tactic, similar to how this can be achieved with Tinker. However, it only works for a small subset of ML and it is not clear how this approach can be generalised to arbitrary tactics. Moreover, it unfolds only one particular branch of the proof, which does not necessarily reflect the underlying proof strategy. Another tool recently developed to support debugging is a reasonably new tracing mechanism for the simp tactic in Isabelle [19]. This is implemented as a plugin for the Isabelle/jEdit Prover IDE. It supports hierarchical viewing of simplification traces, and, as with Tinker, it enables breakpoints to be inserted where the user can step through and interact with the tactic. A breakpoint can be triggered either by the application of certain theorems or by the goal matching certain patterns. Note that it is not used to debug the (sub)tactics used to implement the simplifier: it will only show how the simplifier applies rewrite steps. Our logging mechanism is considerably simpler and closer to the more rudimentary ones supported in other ITP systems (including the previous tracing mechanism for the simp tactic in Isabelle). However, in practice we have found that our logging mechanism is sufficient as it only relates to a step at a time, while the simp tactic could involve hundreds of steps.
Rippling has been implemented in several systems, the closest being IsaPlanner [13]. In his Ph.D. thesis, Lin addressed rippling for Event-B invariant proofs [27], which is comparable to the Z representation used here. This was implemented in IsaPlanner, which has a similar composition language to tacticals that we used to motivate PSGraph in § 1. Finally, in [16] we developed a very basic and ad hoc PSGraph version of rippling, which we have considerably improved upon and extended here. One key difference is the more sophisticated goal type language, which enables us to make more details of rippling declarative and available to the users (as opposed to “hidden” in the boxes).
8 Conclusion and future work
We have given a detailed and formal account of the semantics of PSGraph and shown how this is implemented in Tinker, with support for multiple interactive theorem provers. We have shown how to develop and debug proof strategies using the GUI of Tinker and illustrated use of the language and tool by encoding several existing proof strategies from scratch in PSGraph.
In the future, we would like to improve static checking of PSGraph, such as being able to validate a PSGraph before evaluation. We also plan to improve the layout algorithm and develop and implement a better framework for combining evaluation and user edits of PSGraphs. We are also working on a more lightweight version of Tinker (independent of the underlying Quantomatic tool), which will make it easier to install and maintain. There have been recent advances in the expressiveness of pattern graphs [22, 23], which will enable us to simplify the graphical parts of the evaluation. We are also working on improving features for parametrised graph tactics, in order to improve reuse.
Footnotes
 1. See, e.g. [29], which addresses industrial use of PSGraph.
 2. We use goal for both the top-level goal and any subgoal.
 3. The brain also finds diagrammatic representations more natural to understand for such “process systems” [24].
 4. The Tinker source code is available from [18].
 5.
 6. Isabelle has a subgoal package, used, e.g. by Eisbach [31], which we plan to use in the future.
 7. Many Isabelle tactics also expect a proof context, which we ignore here for simplicity.
 8. See [28] for a detailed example.
 9. Example screen casts can be found at [18].
 10. More details of running the tool are available from the user manual [8].
 11. The idea of using such term difference is similar to the PRESS system for equational theories [38].
 12. If this were part of a larger tactic, then we could have used a breakpoint to start at the entry of this part of the strategy.
 13. We assume absence of circularity.
Notes
Acknowledgements
An initial version of the one point rule in PSGraph was developed with Iain Whiteside. Thanks to Pierre Le Bras, who implemented the Tinker GUI [30], Aleks Kissinger, Rob Arthan, Colin O’Halloran and members of the AI4FM project for valuable discussions.
References
 1. Abrial, J.R., Butler, M., Hallerstede, S., Hoang, T.S., Mehta, F., Voisin, L.: Rodin: an open toolset for modelling and reasoning in Event-B. Int. J. Softw. Tools Technol. Transf. 12(6), 447–466 (2010)
 2. Adams, M.: Refactoring proofs with Tactician. In: Bianculli, D., Calinescu, R., Rumpe, B. (eds.) Software Engineering and Formal Methods: SEFM 2015, Revised Selected Papers, pp. 53–67. Springer, Berlin (2015)
 3. Armando, A., Smaill, A., Green, I.: Automatic synthesis of recursive programs: the proof-planning paradigm. Autom. Softw. Eng. 6(4), 329–356 (1999)
 4. Arthan, R., Jones, R.B.: Z in HOL in ProofPower. BCS FACS FACTS (2005-1). http://www.bcs.org/upload/pdf/facts200503.pdf
 5. Bjørner, D., Jones, C.B. (eds.): The Vienna Development Method: the meta-language. Lecture Notes in Computer Science, vol. 61. Springer (1978). doi: 10.1007/3-540-08766-4
 6. Bornat, R., Sufrin, B.: Jape: A calculator for animating proof-on-paper. In: McCune, W. (ed.) CADE-14, pp. 412–415. Springer, Berlin (1997). doi: 10.1007/3-540-63104-6_41
 7. Bras, P.L.: Web based interface for graphical proof strategies (2015). Undergraduate CS Honours Thesis. https://goo.gl/LWG522
 8. Bras, P.L., Grov, G., Lin, Y.: Tinker: user guide. http://ggrov.github.io/tinker/userGuides.pdf
 9. Bundy, A.: A science of reasoning. In: de Swart, H. (ed.) International Conference on Automated Reasoning with Analytic Tableaux and Related Methods, pp. 10–17. Springer, Berlin (1998)
 10. Bundy, A., Basin, D., Hutter, D., Ireland, A.: Rippling: Meta-level Guidance for Mathematical Reasoning, Cambridge Tracts in Theoretical Computer Science, vol. 56. Cambridge University Press, Cambridge (2005)
 11. Burstall, R.: ProveEasy: Helping people learn to do proofs. ENTCS 31, 16–32 (2000). doi: 10.1016/S1571-0661(05)80327-5
 12. Delahaye, D.: A Proof Dedicated Meta-Language. Electronic Notes in Theoretical Computer Science 70(2), 96–109 (2002)
 13. Dixon, L., Fleuriot, J.: Higher Order Rippling in IsaPlanner. In: Slind, K., Bunker, A., Gopalakrishnan, G. (eds.) TPHOL, pp. 83–98. Springer, Berlin (2004)
 14. Dixon, L., Kissinger, A.: Open graphs and monoidal theories. Math. Struct. Comput. Sci. 23(2), 308–359 (2013). doi: 10.1017/S0960129512000138
 15. Freitas, L., Whiteside, I.: Proof patterns for formal methods. In: International Symposium on Formal Methods, pp. 279–295. Springer, Berlin (2014)
 16. Grov, G., Kissinger, A., Lin, Y.: A graphical language for proof strategies. In: McMillan, K., Middeldorp, A., Voronkov, A. (eds.) LPAR, pp. 324–339. Springer, Berlin (2013)
 17. Grov, G., Kissinger, A., Lin, Y.: Tinker, tailor, solver, proof. In: Benzmüller, C., Paleo, B.W. (eds.) UITP 2014, EPTCS, vol. 167, pp. 23–34. Open Publishing Association, London (2014)
 18. Grov, G., Lin, Y.: Tinker webpage: resources for STTT paper. http://ggrov.github.io/tinker/sttt16/. Accessed Feb 2017
 19. Hupel, L.: Interactive simplifier tracing and debugging in Isabelle. In: Watt, S.M., Davenport, J.H., Sexton, A.P., Sojka, P., Urban, J. (eds.) CICM, pp. 328–343. Springer, Berlin (2014)
 20. Jones, C.B., Shaw, R.C.: Case Studies in Systematic Software Development. Prentice Hall, Upper Saddle River (1990)
 21. Kissinger, A., Merry, A., Soloviev, M.: Pattern graph rewrite systems. CoRR arXiv:1204.6695 (2012)
 22. Kissinger, A., Zamdzhiev, V.: Equational reasoning with context-free families of string diagrams. In: Parisi-Presicce, F., Westfechtel, B. (eds.) Graph Transformation, pp. 33–47. Springer, Berlin (2015)
 23. Kissinger, A., Zamdzhiev, V.: Quantomatic: a proof assistant for diagrammatic reasoning. In: Felty, A.P., Middeldorp, A. (eds.) CADE-25, LNCS, vol. 9195, pp. 326–336. Springer, Berlin (2015)
 24. Larkin, J.H., Simon, H.A.: Why a diagram is (sometimes) worth ten thousand words. Cognit. Sci. 11(1), 65–100 (1987)
 25. Liang, Y., Lin, Y., Grov, G.: The Tinker for Rodin. In: Butler, M., Schewe, K.-D., Mashkoor, A., Biro, M. (eds.) 5th International Conference on Abstract State Machines, Alloy, B, TLA, VDM, pp. 262–268. Springer, Berlin (2016)
 26. Libal, T., Riener, M., Rukhaia, M.: Advanced proof viewing in ProofTool. In: Benzmüller, C., Woltzenlogel, B. (eds.) UITP 2014, EPTCS, vol. 167, pp. 35–47. Open Publishing Association, London (2014)
 27. Lin, Y.: The use of rippling to automate Event-B invariant preservation proofs. Ph.D. thesis, The University of Edinburgh (2015)
 28. Lin, Y., Grov, G., Arthan, R.: Understanding and maintaining tactics graphically OR how we are learning that a diagram can be worth more than 10K LoC. J. Formaliz. Reason. 9(2), 69–130 (2016)
 29. Lin, Y., Grov, G., O’Halloran, C., G., P.: A Super Industrial Application of PSGraph. In: Butler, M.J., Schewe, K.-D., Mashkoor, A., Biró, M. (eds.) 5th International Conference on Abstract State Machines, Alloy, B, TLA, VDM, and Z, pp. 319–325. Springer, Berlin (2016)
 30. Lin, Y., Le Bras, P., Grov, G.: Developing and debugging proof strategies by Tinkering. In: Chechik, M., Raskin, J.-F. (eds.) Tools and Algorithms for the Construction and Analysis of Systems, pp. 573–579. Springer, Berlin (2016)
 31. Matichuk, D., Wenzel, M., Murray, T.: An Isabelle proof method language. In: Klein, G., Gamboa, R. (eds.) 5th International Conference on Interactive Theorem Proving, pp. 390–405. Springer, Cham (2014)
 32. O’Halloran, C.: Automated verification of code automatically generated from Simulink. ASE 20(2), 237–264 (2013)
 33. Ozols, M.A., Cant, A., Eastaughffe, K.A.: XIsabelle: A system description. In: McCune, W. (ed.) CADE-14, pp. 400–403. Springer, Berlin (1997)
 34. Pak, K.: The algorithms for improving and reorganizing natural deduction proofs. Stud. Logic Gramm. Rhetor. 22(35), 95–112 (2010)
 35. Paulson, L.C.: Isabelle: A Generic Theorem Prover, vol. 828. Springer, Berlin (1994)
 36. Paulson, L.C., Blanchette, J.C.: Three years of experience with Sledgehammer, a practical link between automatic and interactive theorem provers. IWIL-2010 1, 1–11 (2010)
 37. Siekmann, J.H., Hess, S.M., Benzmüller, C., Cheikhrouhou, L., Fiedler, A., Horacek, H., Kohlhase, M., Konrad, K., Meier, A., Melis, E., Pollet, M., Sorge, V.: LOUI: lovely OMEGA user interface. Form. Asp. Comput. 11(3), 326–342 (1999)
 38. Sterling, L., Bundy, A., Byrd, L., O’Keefe, R., Silver, B.: Solving symbolic equations with PRESS. In: Calmet, J. (ed.) Computer Algebra, no. 144 in LNCS, pp. 109–116. Springer, Berlin (1982). Also available in J. Symbol. Comput. 7, pp 71–84 (1989)
 39. Wenzel, M.: Isabelle/Isar: a versatile environment for human-readable formal proof documents. Ph.D. thesis, Technische Universität München (2002)
 40. Woodcock, J., Davies, J.: Using Z: Specification, Refinement, and Proof, vol. 39. Prentice Hall, Upper Saddle River (1996)
Copyright information
Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.