Language Definitions as Rewrite Theories
Abstract
\(\mathbb {K}\) is a formal framework for defining the operational semantics of programming languages. It includes software tools for compiling \(\mathbb {K}\) language definitions to Maude rewrite theories, for executing programs in the defined languages based on the Maude rewriting engine, and for analyzing programs by adapting various Maude analysis tools. A recent extension to the \(\mathbb {K}\) tool suite is an automatic transformation of language definitions that enables the symbolic execution of programs, i.e., the execution of programs with symbolic inputs. In this paper we investigate the theoretical relationships between \(\mathbb {K}\) language definitions and their translations to Maude, between symbolic extensions of \(\mathbb {K}\) definitions and their Maude encodings, and how the relations between \(\mathbb {K}\) definitions and their symbolic extensions are reflected on their respective representations in Maude. These results show, in particular, how analyses performed with Maude tools can be formally lifted up to the original language definitions.
Keywords
Path Condition, Symbolic Execution, Reachability Analysis, Ground Term, Rewrite Theory
1 Introduction
\(\mathbb {K}\) [11] is a framework for formally defining the semantics of programming languages. The current version of \(\mathbb {K}\) includes options that have Maude [3] as a backend: the \(\mathbb {K}\) compiler transforms any \(\mathbb {K}\) definition into a Maude module; then, the \(\mathbb {K}\) runner uses Maude to run or analyze programs in the defined language.
Recently, \(\mathbb {K}\) has been extended with symbolic execution support [2]. Briefly, a \(\mathbb {K}\) language definition is automatically transformed into a symbolic language definition, such that the concrete executions of programs using the symbolic definition are symbolic executions of programs using the original language definition. The transformation amounts to incorporating path conditions in program configurations, and to changing the language’s semantic rules so that they match on symbolic configurations and automatically update the path conditions.
Symbolic executions are called feasible if their path conditions are satisfiable. Two results relating concrete and symbolic program executions are proved in [2]: coverage, saying that for each concrete execution there is a feasible symbolic one taking the same path on the program; and precision, saying that for each feasible symbolic execution there is a concrete one taking the same program path.
In the faithful encoding, each semantic rule of the language definition \(\mathcal {L}\) is translated into a rewrite rule of the rewrite theory \(\mathcal {R}(\mathcal {L})\). Equations are only introduced in order to express equality in the data domain. The resulting rewrite theory is proved to be executable by Maude, and the transition system generated by the language definition is shown to be isomorphic to the one generated by the rewrite theory. Some variations of this encoding are also discussed, all of which satisfy the executability and faithfulness properties. As a consequence, both positive and negative results of reachability analyses obtained on rewrite theories (i.e., by using the Maude search command) also hold on the original language definitions. Moreover, all symbolic reachability analysis results obtained on the rewrite-theory representation \(\mathcal {R}({\mathcal {L}}^{\mathfrak {s}})\) of a symbolic language \({\mathcal {L}}^{\mathfrak {s}}\) also hold on the rewrite-theory representation \(\mathcal {R}(\mathcal {L})\) of the language \(\mathcal {L}\). The latter property is analogous to the results obtained in [10], where rewriting modulo SMT is shown to be related to (usual) rewriting in a sound and complete way.
For nontrivial language definitions, the faithful encoding is not very practical, because it typically generates a huge state space that is not amenable to reachability analysis. This is why we introduce approximate representations of language definitions as two-layered rewrite theories. These approximations are obtained by splitting the semantic rules of the language into two sets, called layers, such that the first layer forms a terminating rewrite system. The one-step rewriting in such a theory is obtained by computing an irreducible form w.r.t. rules from the first layer (according to a given strategy), and then applying a rule from the second layer. A simple example of a two-layered rewrite theory is a Maude module consisting of equations and rules, where the equations (denoting the first layer) are only required to be terminating, and both the equations and rules (which form the second layer) specify transitions in the underlying transition-system model of the theory.
In an (approximating) two-layered rewrite theory \(\mathfrak {R}(\mathcal {L})\), only a subset of the executions of programs in the original language \(\mathcal {L}\) is represented. The consequence is that only positive results of reachability analyses on the two-layered rewrite theories can be lifted up to the corresponding language definitions. In addition to reducing the state space to be explored, the approximate encoding of a language by a two-layered rewrite theory can also be seen as the output of a compiler that resolves, at compile time, some semantic choices left open by the language definition. For example, in C, the order in which the operands of addition are evaluated is a compile-time choice. By turning the operand-evaluation rules into first-layer rules, and by letting Maude automatically execute these rules in various orders according to certain strategies, one can reproduce the various compile-time design choices for the evaluation of arguments.
We note that approximating two-layered rewrite theories have some limitations: only the coverage property relating the language definition \(\mathcal {L}\) to its symbolic version \({\mathcal {L}}^{\mathfrak {s}}\) also holds on their respective approximate encodings; the precision property holds only in some restricted cases. However, the precision property between the approximate symbolic encoding \(\mathfrak {R}({\mathcal {L}}^{\mathfrak {s}})\) and the language definition \(\mathcal {L}\) always holds. Hence, one can trace symbolic reachability analyses (performed on \(\mathfrak {R}({\mathcal {L}}^{\mathfrak {s}})\)) back to programs in \(\mathcal {L}\), and also (in some restricted cases) to the representation of programs in \(\mathfrak {R}(\mathcal {L})\), which, as discussed above, can be seen as compiled programs where some semantic choices are left to the compiler.
Organisation. In Sect. 2 we present our working examples, which are two programs belonging to the CinK kernel of C++, which was specified in \(\mathbb {K}\) [7]. A partial description of the \(\mathbb {K}\) definition for CinK is included. In Sect. 3 we introduce a formal notion of a language-definition framework, which allows us to make our approach independent of the \(\mathbb {K}\) language definitional framework and to abstract away some particular implementation details of \(\mathbb {K}\). For the same reason, we will be using rewrite theories (instead of their implementations as Maude modules) for the encodings of language definitions. We also briefly present the language-independent symbolic execution approach [2] and recap some essential notions related to the executability of rewrite theories.
Section 4 presents the faithful and the approximate representations of language definitions into a rewrite theory and the various relations between them (graphically depicted in the above diagram). Section 5 presents the applications of these representations to the compilation of \(\mathbb {K}\) language definitions as Maude modules. Finally, Sect. 6 presents conclusions and related work.
2 Running Example
A major feature of C++ expressions is that given by the “sequenced before” relation [1], which defines a partial order over the evaluation of subexpressions. This can be easily expressed in \(\mathbb {K}\) using the strict attribute to specify an evaluation order for an operation’s operands. If the operator is annotated with the strict attribute then its operands will be evaluated in a nondeterministic order. For instance, all the binary operations are strict. Hence, they may induce nondeterminism in programs because of possible side effects in their arguments.
Another feature is given by the classification of expressions into rvalues and lvalues. The arguments of binary operations are evaluated as rvalues and their results are also rvalues, while, e.g., both the argument of the prefix-increment operation and its result are lvalues. The strict attribute for such operations has a subattribute context for wrapping any subexpression that must be evaluated as an rvalue. Other attributes (\( funcall, divide, plus, minus, \dots \)) are names associated to each syntactic production, which can be used for referring to them.
The \(\mathbb {K}\) framework uses configurations to store program states. A configuration is a nested structure of cells, which typically include the program to be executed, input and output streams, values for program variables, and other additional information. The configuration of CinK (Fig. 2) includes the \({\langle \rangle }{}_\mathsf{k}\) cell containing the code that remains to be executed, which is represented as a list of computation tasks \(C_1\curvearrowright C_2\curvearrowright \ldots \) to be executed in the given order. Computation tasks are typically statements and expression evaluations. The memory is modeled using two cells: \({\langle \rangle }{}_\mathsf{env}\) (which holds a map from variables to addresses) and \({\langle \rangle }{}_\mathsf{store}\) (which holds a map from addresses to values). The configuration also includes a cell for the function call stack and another one for the return values of functions.
where \(\mathtt{+ }_{Int}\) is the mathematical operation for addition. Note that the ellipses in a cell (e.g., \({\langle {\;}{\cdot }{\cdot }{\cdot } \rangle }{}_\mathsf{k}\)) represent the part of the cell not affected by the rule.
The rule for division has a side condition which restricts its application. The conditional statement \(\mathtt{if }\) has two corresponding rules, one for each possible evaluation of the condition expression. The rule for the \(\mathtt{while }\) loop unrolls the loop into an \(\mathtt{if }\) statement. The increment and update rules have side effects in the \({\langle \rangle }{}_\mathsf{store}\) cell, modifying the value stored at a specific address. Finally, the reading of a value from the memory is specified by the lookup rule, which matches a value in the \({\langle \rangle }{}_\mathsf{store}\) cell and places it in the \({\langle \rangle }{}_\mathsf{k}\) cell. The auxiliary construct \(\mathtt{{\$lookup} }\) is used, e.g., when a program variable is evaluated as an rvalue.
where \(\square \) is a special symbol, destined to receive the result of an evaluation.
3 Background
3.1 The Ingredients of a Language Definition
1. A many-sorted algebraic signature \(\varSigma \), which includes at least a sort \( Cfg \) for configurations and a sort \( Bool \) for constraint formulas. For the sake of presentation, we assume in this paper that the constraint formulas are Boolean terms built with a subsignature \(\varSigma ^{\mathsf {Bool}} \subseteq \varSigma \) including the boolean constants and operations. \(\varSigma \) may also include other subsignatures for other data sorts, depending on the language \(\mathcal {L}\) (e.g., integers, identifiers, lists, maps, ...). Let \(\varSigma ^\mathsf {Data}\) denote the subsignature of \(\varSigma \) consisting of all data sorts and their operations. We assume that the sort \( Cfg \) and the syntax of \(\mathcal {L}\) are not data, i.e., they are defined in \(\varSigma \setminus \varSigma ^\mathsf {Data}\). Let \(T_\varSigma \) denote the \(\varSigma \)-algebra of ground terms and \(T_{\varSigma ,s}\) denote the set of ground terms of sort \(s\). Given a sort-wise infinite set of variables \( Var \), let \(T_\varSigma ( Var )\) denote the free \(\varSigma \)-algebra of terms with variables, \(T_{\varSigma ,s}( Var )\) denote the set of terms of sort \(s\) with variables, and \( var (t)\) denote the set of variables occurring in the term \(t\).
2. A \(\varSigma ^\mathsf {Data}\)-model \(\mathcal {D}\), which interprets the data sorts and operations. For convenience, we assume that \(\mathcal {D}_d\subset \varSigma _{d}\) for each data sort \(d\), i.e., the constants are elements of the corresponding signature. Let \(\mathcal {T}\triangleq \mathcal {T}(\mathcal {D})\) denote the free \(\varSigma \)-model generated by \(\mathcal {D}\). The satisfaction relation \(\rho \;\models \;b\) between valuations \(\rho \) and constraint formulas \(b\in T_{\varSigma , Bool }( Var )\) is defined by \(\rho \;\models \; b\) iff \(\rho (b)= {\mathcal {D}}_{{ true }}\). For simplicity, we write \({ true },{ false }, 0, 1\ldots \) instead of \({\mathcal {D}}_{{ true }}, {\mathcal {D}}_{{ false }}, {\mathcal {D}}_0, {\mathcal {D}}_1, \ldots \).
3. A set \(\mathcal {S}\) of rewrite rules. Each rule is a pair of the form \({l}\pmb {\wedge }{b}\;\pmb {\Rightarrow }\;{r} \), where \(l,r\in T_{\varSigma , Cfg }( Var )\) are the rule’s left-hand side and right-hand side, respectively, and \(b\in T_{\varSigma , Bool }( Var )\) is the condition. The formal definitions for rules and for the transition system defined by them are given below.
Remark 1
For the sake of presentation, here we consider only “pure” language definitions, where the semantics is given only by semantic rules between configurations. Some definitions may include additional functions defined by equations. For such cases the language definition may additionally include a set of axioms \(A_0\), e.g., associativity and/or commutativity of some functions, and a set of equations \(E_0\). Then the model \(\mathcal {T}\) is the free algebra modulo \(A_0\cup E_0\). We believe that the approach presented in this paper can be extended to these more involved definitions, but this requires more investigation and is left for future work.
We now formally introduce the notions required for defining semantic rules.
Definition 1
(pattern [12]). A pattern is an expression of the form \({\pi }\;\pmb {\wedge }\;{b}\), where \(\pi \in T_{\varSigma , Cfg }( Var )\) is a basic pattern and \(b\in T_{\varSigma , Bool }( Var )\). If \(\gamma \in T_ Cfg \) and \(\rho \,{:} Var \rightarrow \mathcal {T}\) then we write \((\gamma ,\rho )\;\models \;{\pi }\;\pmb {\wedge }\;{b}\) iff \(\gamma =\rho (\pi )\) and \(\rho \;\models \; b\).
A basic pattern \(\pi \) defines a set of (concrete) configurations, and the condition \(b\) gives additional constraints these configurations must satisfy.
Remark 2
The above definition is a particular case of a definition in [12]. There, a pattern is a first-order logic formula with configuration terms as subformulas. In this paper we keep the conjunction notation from first-order logic but separate basic patterns from constraints. Note that first-order formulas can be encoded as terms of sort \(Bool\), where the quantifiers become constructors. The satisfaction relation \(\models \) is then defined, for such terms, like the usual FOL satisfaction.
We identify basic patterns \(\pi \) with patterns \({\pi }\;\pmb {\wedge }\;{{ true }}\).
Definition 2
(rule, transition system). A rule is a pair of patterns of the form \({l}\pmb {\wedge }{b}\;\pmb {\Rightarrow }\;{r} \) (note that \(r\) is in fact the pattern \({r}\;\pmb {\wedge }\;{{ true }}\)). Any set \(\mathcal {S}\) of rules defines a labelled transition system \((\mathcal {T}_ Cfg , \Rightarrow _{\mathcal {S}})\) such that \(\gamma \mathop {\Longrightarrow }\limits ^{\alpha }\mathop {_{\mathcal {S}}}\limits ^{}\gamma '\) iff there exist \(\alpha \triangleq ({l}\pmb {\wedge }{b}\;\pmb {\Rightarrow }\;{r} ) \in \mathcal {S}\) and \(\rho : Var \rightarrow \mathcal {T}\) such that \((\gamma ,\rho )\;\models \; {l}\;\pmb {\wedge }\;{b}\) and \((\gamma ',\rho )\;\models \;r\).
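To make Definition 2 concrete, the following Python sketch computes the successors of a ground configuration under a set of conditional rules. It is only an illustration under simplifying assumptions that are not part of the paper: terms are nested tuples, variables are strings starting with `?`, the condition and the right-hand side are given as Python functions of the valuation \(\rho \) (so data operations are evaluated directly), and every variable of the right-hand side occurs in the left-hand side. The names `match`, `successors`, and the `div` rule are hypothetical.

```python
def match(pattern, term, rho):
    """Extend the valuation rho so that rho(pattern) == term; None if impossible."""
    if isinstance(pattern, str) and pattern.startswith("?"):   # a variable
        if pattern in rho:
            return rho if rho[pattern] == term else None
        return {**rho, pattern: term}
    if isinstance(pattern, tuple):                              # an operation symbol with arguments
        if not isinstance(term, tuple) or len(pattern) != len(term):
            return None
        for p, t in zip(pattern, term):
            rho = match(p, t, rho)
            if rho is None:
                return None
        return rho
    return rho if pattern == term else None                    # a constant


def successors(rules, gamma):
    """All (label, gamma') such that some rule l /\\ b => r relates gamma to gamma'."""
    result = []
    for label, lhs, cond, rhs in rules:
        rho = match(lhs, gamma, {})            # (gamma, rho) |= l ...
        if rho is not None and cond(rho):      # ... and rho |= b
            result.append((label, rhs(rho)))   # gamma' = rho(r)
    return result


# Hypothetical rule, loosely modelled on a division rule with a side condition.
RULES = [
    ("div",
     ("k", ("/", "?I1", "?I2")),
     lambda rho: rho["?I2"] != 0,
     lambda rho: ("k", rho["?I1"] // rho["?I2"])),
]

print(successors(RULES, ("k", ("/", 7, 2))))
print(successors(RULES, ("k", ("/", 7, 0))))
```

The second call returns no successor because the side condition fails, which is how a rule's condition restricts its applicability.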
3.2 Symbolic Execution
We briefly recap our approach to symbolic execution from [2]. The main idea is to automatically generate a new definition \(({\varSigma }^{\mathfrak {s}},{\mathcal {T}}^{\mathfrak {s}},{\mathcal {S}}^{\mathfrak {s}})\) for a language \({\mathcal {L}}^{\mathfrak {s}}\) from a given definition \((\varSigma , \mathcal {T}, \mathcal {S})\) of a language \(\mathcal {L}\). The new language \({\mathcal {L}}^{\mathfrak {s}}\) has the same syntax, and its semantics extends \(\mathcal {L}\)’s data domains with symbolic values and adapts the semantical rules of \(\mathcal {L}\) to deal with the new domains.
Let \({V}^{\mathfrak {s}}\) denote an infinite, data-sort-wise set of symbolic values, disjoint from \( Var \) and from symbols in \(\varSigma \). The data algebra is extended to \({\mathcal {D}}^{\mathfrak {s}}\), which is the algebra of ground terms over the signature \(\varSigma ^\mathsf {Data}({V}^{\mathfrak {s}})\).
Remark 3
The approach in [2] allows some freedom in choosing the algebra \({\mathcal {D}}^{\mathfrak {s}}\), to enable the use of decision procedures for handling symbolic artifacts.
The signature \({\varSigma }^{\mathfrak {s}}\) extends \(\varSigma \) with the symbolic values \({V}^{\mathfrak {s}}\) as constants, a new sort \({ Cfg }^{\mathfrak {s}}\) and a constructor \({\_}\;\pmb {\wedge }\;{\_}: Cfg \times Bool \rightarrow { Cfg }^{\mathfrak {s}}\). The model \({\mathcal {T}}^{\mathfrak {s}}\) is defined as being the free \({\varSigma }^{\mathfrak {s}}\)-model generated by \({\mathcal {D}}^{\mathfrak {s}}\), similarly to how \(\mathcal {T}\) is built over \(\mathcal {D}\). The ground terms \({\pi }\;\pmb {\wedge }\;{\phi }\in {\mathcal {T}}^{\mathfrak {s}}_{{ Cfg }^{\mathfrak {s}}}\) are called symbolic configurations. Let \([\![{\pi }\;\pmb {\wedge }\;{\phi }]\!]\) denote the set of concrete configurations \(\{\gamma \mid (\exists \rho )\,(\gamma ,\rho )\;\models \; {\pi }\;\pmb {\wedge }\;{\phi }\}\).
Then, the symbolic execution of \({\mathcal {L}}\) programs is the concrete execution of the corresponding \({\mathcal {L}}^{\mathfrak {s}}\) programs, i.e., the application of the rewrite rules in the semantics of \({\mathcal {L}}^{\mathfrak {s}}\). Building the definition of \({\mathcal {L}}^{\mathfrak {s}}\) amounts to extending the signature \(\varSigma \) to a symbolic signature \({\varSigma }^{\mathfrak {s}}\), extending the \(\varSigma \)-algebra \(\mathcal {T}\) to a \({\varSigma }^{\mathfrak {s}}\)-algebra \({\mathcal {T}}^{\mathfrak {s}}\), and turning the concrete rules \(\mathcal {S}\) into symbolic rules \({\mathcal {S}}^{\mathfrak {s}}\). The transition system \(({\mathcal {T}}^{\mathfrak {s}}_{{ Cfg }^{\mathfrak {s}}},\Rightarrow _{{\mathcal {S}}^{\mathfrak {s}}})\) is defined using Definitions 1 and 2 applied to \({\mathcal {L}}^{\mathfrak {s}}\). In [2] it is proved that the symbolic transition system forward-simulates the concrete one, and that the concrete transition system backward-simulates the symbolic one. These two results then imply the naturally expected properties of symbolic execution.
Theorem 1
(Coverage [2]). For every concrete execution \(\gamma _0 \mathop {\Longrightarrow }\limits ^{\alpha _1}\mathop {_{\mathcal {S}}}\limits ^{} \gamma _1 \mathop {\Longrightarrow }\limits ^{\alpha _2}\mathop {_{\mathcal {S}}}\limits ^{} \cdots \mathop {\Longrightarrow }\limits ^{\alpha _n}\mathop {_{\mathcal {S}}}\limits ^{} \gamma _n \mathop {\Longrightarrow }\limits ^{\alpha _{n+1}}\mathop {_{\mathcal {S}}}\limits ^{} \cdots \) there is a symbolic execution \({\pi _0}\;\pmb {\wedge }\;{\phi _0} \mathop {\Longrightarrow }\limits ^{\alpha _1}\mathop {_{{\mathcal {S}}^{\mathfrak {s}}}}\limits ^{} {\pi _1}\;\pmb {\wedge }\;{\phi _1} \mathop {\Longrightarrow }\limits ^{\alpha _2}\mathop {_{{\mathcal {S}}^{\mathfrak {s}}}}\limits ^{} \cdots \mathop {\Longrightarrow }\limits ^{\alpha _n}\mathop {_{{\mathcal {S}}^{\mathfrak {s}}}}\limits ^{} {\pi _n}\;\pmb {\wedge }\;{\phi _n} \mathop {\Longrightarrow }\limits ^{\alpha _{n+1}}\mathop {_{{\mathcal {S}}^{\mathfrak {s}}}}\limits ^{} \cdots \) such that \(\gamma _i \in [\![{\pi _i}\;\pmb {\wedge }\;{\phi _i}]\!]\) for \(i = 0, 1, \ldots \).
A symbolic configuration \({\pi }\;\pmb {\wedge }\;{\phi }\in {\mathcal {T}}^{\mathfrak {s}}_{{ Cfg }^{\mathfrak {s}}}\) is satisfiable if there is a valuation \(\vartheta :{V}^{\mathfrak {s}}\rightarrow \mathcal {D}\) such that \(\vartheta \;\models \; \phi \) (which is equivalent to \([\![{\pi }\;\pmb {\wedge }\;{\phi }]\!]\not =\emptyset \)). We call a symbolic execution feasible if all its configurations are satisfiable.
Theorem 2
(Precision [2]). For every feasible symbolic execution \({\pi _0}\;\pmb {\wedge }\;{\phi _0} \mathop {\Longrightarrow }\limits ^{\alpha _1}\mathop {_{{\mathcal {S}}^{\mathfrak {s}}}}\limits ^{} {\pi _1}\;\pmb {\wedge }\;{\phi _1} \mathop {\Longrightarrow }\limits ^{\alpha _2}\mathop {_{{\mathcal {S}}^{\mathfrak {s}}}}\limits ^{} \cdots \mathop {\Longrightarrow }\limits ^{\alpha _n}\mathop {_{{\mathcal {S}}^{\mathfrak {s}}}}\limits ^{} {\pi _n}\;\pmb {\wedge }\;{\phi _n} \mathop {\Longrightarrow }\limits ^{\alpha _{n+1}}\mathop {_{{\mathcal {S}}^{\mathfrak {s}}}}\limits ^{} \cdots \) there is a concrete execution \(\gamma _0 \mathop {\Longrightarrow }\limits ^{\alpha _1}\mathop {_{\mathcal {S}}}\limits ^{} \gamma _1 \mathop {\Longrightarrow }\limits ^{\alpha _2}\mathop {_{\mathcal {S}}}\limits ^{} \cdots \mathop {\Longrightarrow }\limits ^{\alpha _n}\mathop {_{\mathcal {S}}}\limits ^{} \gamma _n \mathop {\Longrightarrow }\limits ^{\alpha _{n+1}}\mathop {_{\mathcal {S}}}\limits ^{} \cdots \) such that \(\gamma _i \in [\![{\pi _i}\;\pmb {\wedge }\;{\phi _i}]\!]\) for \(i = 0, 1, \ldots \).
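The interplay between path conditions, the denotation \([\![{\pi }\;\pmb {\wedge }\;{\phi }]\!]\), and feasibility can be illustrated on a single conditional statement. The Python sketch below is not the transformation of [2]; it is a toy model under explicit assumptions: there is one symbolic value `x`, satisfiability is decided by brute force over a small finite domain (standing in for a decision procedure), symbolic sub-terms are Python functions of the valuation, and all names (`symbolic_steps`, `denotation`, ...) are hypothetical.

```python
from itertools import product

DOMAIN = range(-3, 4)   # assumption: finite stand-in for the integer data domain
SYMBOLS = ("x",)        # the symbolic values occurring in the program


def valuations():
    """Enumerate all valuations of the symbolic values over DOMAIN."""
    for values in product(DOMAIN, repeat=len(SYMBOLS)):
        yield dict(zip(SYMBOLS, values))


def concretize(pi, v):
    """Instantiate the symbolic parts of pi (modelled as functions) with v."""
    if callable(pi):
        return pi(v)
    if isinstance(pi, tuple):
        return tuple(concretize(p, v) for p in pi)
    return pi


def denotation(pi, phi):
    """[[pi /\\ phi]]: the concrete configurations described by pi under phi."""
    return {concretize(pi, v) for v in valuations() if phi(v)}


def satisfiable(phi):
    return any(phi(v) for v in valuations())


def concrete_step(gamma):
    """Concrete rules for `if`: one rule per value of the condition."""
    _, b, then_branch, else_branch = gamma
    return ("if-true", then_branch) if b else ("if-false", else_branch)


def symbolic_steps(pi, phi):
    """Symbolic rules for `if`: both apply, each strengthening the path condition."""
    _, b, then_branch, else_branch = pi
    return [
        ("if-true", then_branch, lambda v: phi(v) and b(v)),
        ("if-false", else_branch, lambda v: phi(v) and not b(v)),
    ]


pi0 = ("if", lambda v: v["x"] > 0, "A", "B")   # if (x > 0) A else B

# Coverage: the concrete step for x = 2 is matched by a symbolic step.
label, gamma1 = concrete_step(concretize(pi0, {"x": 2}))
steps = symbolic_steps(pi0, lambda v: True)
print(any(l == label and gamma1 in denotation(p, f) for l, p, f in steps))

# Feasibility: under the initial path condition x > 0 the else-branch is infeasible.
print([satisfiable(f) for _, _, f in symbolic_steps(pi0, lambda v: v["x"] > 0)])
```

In this toy setting, an unsatisfiable path condition marks a symbolic step that has no concrete counterpart, which is why precision is stated for feasible executions only.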
3.3 Rewrite Theories
A rewrite theory \(\mathcal {R}=(\varSigma ,E\cup A,R)\), with equations \(E\), axioms \(A\), and rewrite rules \(R\), is executable if:
1. there exists a matching algorithm modulo \(A\);
2. \((\varSigma ,E\cup A)\) is ground Church-Rosser and terminating modulo \(A\) (the equations \(E\) are seen here as rewrite rules oriented from left to right). Thus, each ground term \(t\) has a canonical form \( can _{E/A}(t)\) that is unique modulo the axioms \(A\);
3. \(R\) is ground coherent w.r.t. \(E\) modulo \(A\) [13]: for all \(t, t_1 \in T_\varSigma \) with \(t\rightarrow _{R/A}t_1\) there is \(t_2\in T_\varSigma \) s.t. \({ can}_{E/A}(t)\rightarrow _{R/A}t_2\) and \({ can}_{E/A}(t_1)=_A can_{E/A}(t_2)\).
The rewriting relation \(\rightarrow _\mathcal {R}\) defined by an executable rewrite theory \(\mathcal {R}\) is: \(t_1\rightarrow _\mathcal {R}t_2\) iff \( can _{E/A}(t_1)\rightarrow _{R/A}t'_2\) and \( can _{E/A}(t'_2) = t_2\). This is equivalent to \(\rightarrow _{R/(E\cup A)}\) due to confluence and coherence. We write \(t_1\!\xrightarrow {\alpha }_\mathcal {R}\!t_2\) to emphasise that \(\alpha \triangleq (l\rightarrow r~\mathbf{if}~b)\,\in \,R\) is applied in the rewriting step \( can _{E/A}(t_1)\!\rightarrow _{R/A}\!t'_2\).
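The shape of the relation \(\rightarrow _\mathcal {R}\) (normalise with the equations, apply one rule, normalise again) can be sketched in a few lines of Python. The sketch assumes \(A=\emptyset \), a single data operation `+` on integers playing the role of \(E\), and rules applied only at the top of a term; the function names and the `double` rule are hypothetical and chosen purely for illustration.

```python
def can(t):
    """Canonical form w.r.t. E: evaluate integer additions bottom-up (A is empty)."""
    if isinstance(t, tuple):
        t = tuple(can(a) for a in t)
        if len(t) == 3 and t[0] == "+" and isinstance(t[1], int) and isinstance(t[2], int):
            return t[1] + t[2]
    return t


def rule_double(t):
    """Hypothetical rule double(N) -> N + N, applied at the top of the term."""
    if isinstance(t, tuple) and len(t) == 2 and t[0] == "double" and isinstance(t[1], int):
        return ("+", t[1], t[1])
    return None


def step(t1, rules):
    """All t2 with t1 ->_R t2: can(t1) is rewritten by one rule, then normalised."""
    start = can(t1)
    results = []
    for rule in rules:
        t2_prime = rule(start)
        if t2_prime is not None:
            results.append(can(t2_prime))
    return results


print(step(("double", ("+", 1, 2)), [rule_double]))
```

Here the argument `1 + 2` is first reduced by the equations, the rule then fires on `double(3)`, and the result `3 + 3` is normalised again.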
4 Translating Language Definitions into Rewrite Theories
This section includes the main contribution of the paper. We introduce two encodings of language definitions as rewrite theories: a faithful encoding and an approximate encoding. Since the symbolic extension of a language is also a language definition, we automatically get encodings of both concrete languages and their symbolic extensions. We investigate how the properties relating a language definition and its symbolic extension are reflected on their respective encodings.
Definition 3

Let \(\mathcal {L}=(\varSigma ,\mathcal {T},\mathcal {S})\) be a language definition. The rewrite theory \(\mathcal {R}(\mathcal {L})=(\varSigma ,E\cup A,R)\) is defined as follows:
• \(A=\emptyset \);
• for each operation \(f\) in \(\varSigma ^\mathsf {Data}\) and \(d_1,\ldots ,d_n\in \mathcal {D}\) of corresponding sorts, \(E\) includes an equation \(f(d_1,\ldots ,d_n)=\mathcal {D}_f(d_1,\dots ,d_n)\);
• \(R=\mathcal {S}\), where each rule \({l}\pmb {\wedge }{b}\;\pmb {\Rightarrow }\;{r} \in \mathcal {S}\) becomes a rewrite rule \((l\rightarrow r~\mathbf{if}~b)\in R\).
Theorem 3
Let \(\mathcal {L}=(\varSigma ,\mathcal {T},\mathcal {S})\) be a language definition. Then \(\mathcal {R}(\mathcal {L})\) is an executable rewrite theory satisfying \(\gamma \mathop {\Longrightarrow }\limits ^{\alpha }\mathop {_{\mathcal {S}}}\limits ^{}\gamma '\) iff \(\gamma \xrightarrow {\alpha }_{\mathcal {R}(\mathcal {L})}\gamma '\), for all \(\gamma ,\gamma '\in \mathcal {T}_ Cfg \).
Remark 4
The construction of the rewrite theory \(\mathcal {R}(\mathcal {L})\), with data domain \(\mathcal {D}\subseteq \varSigma ^\mathsf {Data}\) defined by the set of equations \(E\) given in Definition 3, corresponds to the data domains \(\mathcal {D}\) being built-in sorts in the Maude terminology. A built-in sort is a sort that is not built algebraically but one that, for efficiency reasons, is directly implemented in code (C++ code in the case of Maude). For example, natural numbers are specified by the equational specification \(0:\mathsf {Nat}, s: \mathsf {Nat} \rightarrow \mathsf {Nat}\), but using the resulting unary notation for them would be highly inefficient. This is why natural numbers are implemented as built-ins. The construction \(\mathcal {R}(\mathcal {L})\) can, however, be extended to accommodate non-built-in sorts, i.e., sorts that are defined as the initial model of a finite set of equations \(E'\) that are confluent and terminating modulo a set \(A\) of axioms. For this, it is enough to ensure that \(E' \cup E\) is also confluent and terminating modulo \(A\), where \(E\) is the set of equations given in the proof of Theorem 3. This typically happens, as \(E\) and \(E'\) refer to different sorts: the built-in ones for the former, and the non-built-in ones for the latter. If this is the case then the proof of the ground coherence property in Theorem 3 still holds, because it only depends on \(E' \cup E\) being confluent and terminating modulo \(A\), not on the particular form of the equations. The proof of faithfulness of the encoding remains the same. This observation is important, since it ensures that we obtain executable Maude rewrite theories \(\mathcal {R}(\mathcal {L})\) for language definitions \(\mathcal {L}\) whose data are specified using either built-in sorts or non-built-in sorts.
The faithfulness of the encoding then ensures that all results of reachability analyses (either positive or negative) performed on \(\mathcal {R}(\mathcal {L})\), e.g., obtained using Maude’s search command, also hold on \(\mathcal {L}\).
The symbolic extension of a language definition can be encoded as a rewrite theory as well. Let \({\mathcal {L}}^{\mathfrak {s}}=({\varSigma }^{\mathfrak {s}},{\mathcal {T}}^{\mathfrak {s}},{\mathcal {S}}^{\mathfrak {s}})\) be the symbolic extension of \(\mathcal {L}=(\varSigma ,\mathcal {T},\mathcal {S})\). Recall that \({\varSigma }^{\mathfrak {s}}\) is \(\varSigma \) extended with the constructor of symbolic configurations \({\_}\;\pmb {\wedge }\;{\_}\) and with the symbolic values \({V}^{\mathfrak {s}}\) seen as constants. The symbolic configurations are ground terms \({\pi }\;\pmb {\wedge }\;{\phi }\in {\mathcal {T}}^{\mathfrak {s}}_{{ Cfg }^{\mathfrak {s}}}\). If \(\mathcal {R}({\mathcal {L}}^{\mathfrak {s}})=({\varSigma }^{\mathfrak {s}},E\cup A,R)\) is the faithful encoding given by Theorem 3, then \(E=A=\emptyset \) because the data algebra \({\mathcal {D}}^{\mathfrak {s}}\) we considered is the \(\varSigma ^\mathsf{Data}({V}^{\mathfrak {s}})\)algebra of the ground terms built over \(\mathcal {D}\) and \({V}^{\mathfrak {s}}\). Recall that we assumed that \(\mathcal {D}\subseteq \varSigma \subseteq \varSigma ^\mathsf{Data}({V}^{\mathfrak {s}})\).
The relationship between a language definition \(\mathcal {L}\) and its symbolic extension \({\mathcal {L}}^{\mathfrak {s}}\) can be now reflected at the level of the encodings \(\mathcal {R}(\mathcal {L})\) and \(\mathcal {R}({\mathcal {L}}^{\mathfrak {s}})\). A symbolic configuration \({\pi }\;\pmb {\wedge }\;{\phi }\) consists of a configuration ground term \(\pi \) (of sort \( Cfg \)) and a formula ground term \(\phi \) (of sort \( Bool \)). The constants \({V}^{\mathfrak {s}}\) play the role of logical variables, and the definition of satisfiability for patterns extends to their representations as symbolic configurations. Moreover, the notion of feasible execution in \(\mathcal {R}({\mathcal {L}}^{\mathfrak {s}})\) is defined similarly to how it is defined for \({\mathcal {L}}^{\mathfrak {s}}\). The following two results are direct consequences of Theorems 3, 1, and 2, respectively.
Corollary 1
(Coverage for Encoding Rewrite Theories). For every concrete execution \(\gamma _0 \xrightarrow {\alpha _1}_{\mathcal {R}(\mathcal {L})} \gamma _1 \xrightarrow {\alpha _2}_{\mathcal {R}(\mathcal {L})} \cdots \xrightarrow {\alpha _n}_{\mathcal {R}(\mathcal {L})} \gamma _n \xrightarrow {\alpha _{n+1}}_{\mathcal {R}(\mathcal {L})} \cdots \) there is a symbolic execution \({\pi _0}\;\pmb {\wedge }\;{\phi _0} \xrightarrow {\alpha _1}_{\mathcal {R}({\mathcal {L}}^{\mathfrak {s}})}{\pi _1}\;\pmb {\wedge }\;{\phi _1} \xrightarrow {\alpha _2}_{\mathcal {R}({\mathcal {L}}^{\mathfrak {s}})} \cdots \xrightarrow {\alpha _n}_{\mathcal {R}({\mathcal {L}}^{\mathfrak {s}})} {\pi _n}\;\pmb {\wedge }\;{\phi _n} \xrightarrow {\alpha _{n+1}}_{\mathcal {R}({\mathcal {L}}^{\mathfrak {s}})} \cdots \) such that \(\gamma _i \in [\![{\pi _i}\;\pmb {\wedge }\;{\phi _i}]\!]\) for \(i = 0, 1, \ldots \).
Corollary 2
(Precision for Encoding Rewrite Theories). For every feasible symbolic execution \({\pi _0}\;\pmb {\wedge }\;{\phi _0} \xrightarrow {\alpha _1}_{\mathcal {R}({\mathcal {L}}^{\mathfrak {s}})}{\pi _1}\;\pmb {\wedge }\;{\phi _1} \xrightarrow {\alpha _2}_{\mathcal {R}({\mathcal {L}}^{\mathfrak {s}})} \cdots \xrightarrow {\alpha _n}_{\mathcal {R}({\mathcal {L}}^{\mathfrak {s}})} {\pi _n}\;\pmb {\wedge }\;{\phi _n} \xrightarrow {\alpha _{n+1}}_{\mathcal {R}({\mathcal {L}}^{\mathfrak {s}})} \cdots \) there is a concrete execution \(\gamma _0 \xrightarrow {\alpha _1}_{\mathcal {R}(\mathcal {L})} \gamma _1 \xrightarrow {\alpha _2}_{\mathcal {R}(\mathcal {L})} \cdots \xrightarrow {\alpha _n}_{\mathcal {R}(\mathcal {L})} \gamma _n \xrightarrow {\alpha _{n+1}}_{\mathcal {R}(\mathcal {L})} \cdots \) such that \(\gamma _i \in [\![{\pi _i}\;\pmb {\wedge }\;{\phi _i}]\!]\) for \(i = 0, 1, \ldots \).

The heating and cooling rules, which are symmetric to each other, may lead to infinite rewritings;

The generated state space may be very large, even for small programs.
The former amounts to basically deriving a new definition, where the new model \(\mathcal {T}\) is the quotient of the original one, usually requiring substantial input from the user, which is something we would like to avoid.
The latter might not be suitable for language definitions in general because, semantically, it would equate elements that are supposed to be distinct in \(\mathcal {T}\). Consider a language construct randBool with two rules: randBool => true and randBool => false. Assume now we want to analyze a program which uses randBool, but which fails to satisfy a given property regardless of whether randBool transits to true or to false. In this case it might be beneficial to collapse the state space by considering only one of the cases; however, if we transform the two rules above into equations, this will semantically identify true and false in \(\mathcal {T}\), collapsing much more of the state space than desirable. An additional operational concern is that transforming certain rules into equations might destroy coherence and/or confluence, thus falling out of the executability requirements.
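The randBool example can be replayed in a few lines of Python. The sketch below is purely illustrative and uses hypothetical helper names: read as rules, the two statements give randBool two distinct successors; read as equations, they generate a congruence (computed here with a simple union-find) in which true and false end up in the same class.

```python
# The two statements, written as (left-hand side, right-hand side) pairs.
STATEMENTS = [("randBool", "true"), ("randBool", "false")]


def successors(term):
    """Reading the statements as rules: one transition per matching rule."""
    return {rhs for lhs, rhs in STATEMENTS if lhs == term}


def equivalence_classes(equations, terms):
    """Reading the statements as equations: merge both sides (union-find)."""
    parent = {t: t for t in terms}

    def find(t):
        while parent[t] != t:
            t = parent[t]
        return t

    for lhs, rhs in equations:
        parent[find(lhs)] = find(rhs)
    return {t: find(t) for t in terms}


# As rules: two distinct successor states, true and false stay different.
print(successors("randBool"))

# As equations: true and false are identified in the resulting model.
classes = equivalence_classes(STATEMENTS, ["randBool", "true", "false"])
print(classes["true"] == classes["false"])
```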
Two-layered rewrite theories, introduced below, allow us to preserve the benefits of the techniques above (state space reduction, efficient execution), while avoiding their semantical consequences (unnecessary collapse of states in the semantical model \(\mathcal {T}\)).
Definition 4
A two-layered rewrite theory is a tuple \(\mathfrak {R}=(\varSigma ,E\cup A, 1R \cup 2R ,\varepsilon )\), where \((\varSigma ,E\cup A, 1R \cup 2R )\) is an executable rewrite theory, \(E\cup 1R \) is ground terminating modulo \(A\), and \(\varepsilon : T_{\varSigma } \rightarrow T_{\varSigma }\) is a function that, for any \(t \in T_{\varSigma }\), returns an element of the set of \((E\cup 1R )/A\)-irreducible terms \(\{t' \in T_{\varSigma } \mid t \rightarrow ^!_{(E\cup 1R )/A}\, t'\}\) (which is nonempty precisely because \( E\cup 1R \) is ground terminating modulo \(A\)). The one-step rewrite relation \(\twoheadrightarrow _\mathfrak {R}\) is defined by \(t_1\twoheadrightarrow _\mathfrak {R}t_2\) iff \(\varepsilon (t_1) \rightarrow _{ 2R /A} t_2'\) and \(can_{E/A}(t_2')=_A t_2\).
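The one-step relation of Definition 4 can be sketched in Python as follows. This is only an illustration under simplifying assumptions: terms are plain strings, there are no equations or axioms (so \(can_{E/A}\) is the identity), and \(\varepsilon \) always applies the first applicable first-layer rule. The names `normalize`, `two_layer_step`, and the toy rules are hypothetical and not part of the \(\mathbb {K}\) tool or Maude.

```python
from typing import Callable, List

# A rule maps a term to the list of its possible results
# (the empty list means the rule does not apply).
Rule = Callable[[str], List[str]]


def normalize(term: str, layer1: List[Rule], max_steps: int = 1000) -> str:
    """Sketch of epsilon: rewrite with first-layer rules until none applies.

    Always takes the first applicable rule and its first result, so it
    returns ONE irreducible representative, not all of them.
    """
    for _ in range(max_steps):
        for rule in layer1:
            results = rule(term)
            if results:
                term = results[0]
                break
        else:
            return term  # no first-layer rule applies: term is irreducible
    raise RuntimeError("first-layer rules did not terminate within max_steps")


def two_layer_step(term: str, layer1: List[Rule], layer2: List[Rule]) -> List[str]:
    """All t2 such that epsilon(term) rewrites to t2 by one second-layer rule."""
    representative = normalize(term, layer1)
    successors: List[str] = []
    for rule in layer2:
        for result in rule(representative):
            if result not in successors:
                successors.append(result)
    return successors


# Toy rules (assumed for illustration): 'skip; ' is consumed silently in the
# first layer, while randBool is a genuine choice kept in the second layer.
def skip_rule(t: str) -> List[str]:
    return [t[len("skip; "):]] if t.startswith("skip; ") else []


def rand_true(t: str) -> List[str]:
    return [t.replace("randBool", "true", 1)] if "randBool" in t else []


def rand_false(t: str) -> List[str]:
    return [t.replace("randBool", "false", 1)] if "randBool" in t else []


print(two_layer_step("skip; skip; randBool", [skip_rule], [rand_true, rand_false]))
# prints ['true', 'false']
```

The two `skip` steps are absorbed into a single observable step, yet `true` and `false` remain distinct successors; nothing is identified semantically, which is the point of keeping first-layer rules as rules rather than equations.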
Theorem 4
Let \(\mathcal {L}=(\varSigma ,\mathcal {T},\mathcal {S})\) be a language definition and \(\mathfrak {R}(\mathcal {L})=(\varSigma ,E\cup A, 1R \cup 2R ,\varepsilon )\) be a two-layered rewrite theory with \((\varSigma ,E\cup A, 1R \cup 2R )\) built as in Definition 3 but where the set of rules is partitioned into two subsets \( 1R \) and \( 2R \) and \( E\cup 1R \) is terminating modulo \(A\). If \(\gamma \twoheadrightarrow _{\mathfrak {R}(\mathcal {L})}\gamma '\) then \(\gamma \Rightarrow _{\mathcal {S}}^+\gamma '\).
We say that \(\mathfrak {R}(\mathcal {L})\) is an approximate encoding of \(\mathcal {L}\).
Corollary 3
(precision for approximate encoding). Let \(\mathcal {L}=(\varSigma ,\mathcal {T},\mathcal {S})\) be a language definition and \(\mathfrak {R}({\mathcal {L}}^{\mathfrak {s}})=(\varSigma ,E\cup A, 1R \cup 2R ,\varepsilon )\) be an approximate encoding of \({\mathcal {L}}^{\mathfrak {s}}\). For each feasible symbolic execution \({\pi _0}\;\pmb {\wedge }\;{\phi _0} \xrightarrow {}_{\mathfrak {R}({\mathcal {L}}^{\mathfrak {s}})}{\pi _1}\;\pmb {\wedge }\;{\phi _1} \xrightarrow {}_{\mathfrak {R}({\mathcal {L}}^{\mathfrak {s}})} \cdots \xrightarrow {}_{\mathfrak {R}({\mathcal {L}}^{\mathfrak {s}})} {\pi _n}\;\pmb {\wedge }\;{\phi _n} \xrightarrow {}_{\mathfrak {R}({\mathcal {L}}^{\mathfrak {s}})} \cdots \) there is a concrete execution in \(\mathcal {L}\): \(\gamma _0 \mathop {\Longrightarrow }\limits ^{\alpha _1}\mathop {_{\mathcal {S}}}\limits ^{+} \gamma _1 \mathop {\Longrightarrow }\limits ^{\alpha _2}\mathop {_{\mathcal {S}}}\limits ^{+} \cdots \mathop {\Longrightarrow }\limits ^{\alpha _n}\mathop {_{\mathcal {S}}}\limits ^{+} \gamma _n \mathop {\Longrightarrow }\limits ^{\alpha _{n+1}}\mathop {_{\mathcal {S}}}\limits ^{+} \cdots \) such that \(\gamma _i \in [\![{\pi _i}\;\pmb {\wedge }\;{\phi _i}]\!]\) for \(i = 0, 1, \ldots \).
An interesting and practically relevant question is whether the coverage/precision relationships between \(\mathcal {L}\) and \({\mathcal {L}}^{\mathfrak {s}}\) can be reflected at the level of the approximate encodings as two-layered rewrite theories. To investigate these relationships, we have to find a way to define an approximate two-layered rewrite theory \(\mathfrak {R}({\mathcal {L}}^{\mathfrak {s}})\) that extends a given approximate two-layered rewrite theory \(\mathfrak {R}(\mathcal {L})\). A first attempt is to define \(\mathfrak {R}({\mathcal {L}}^{\mathfrak {s}})=({\varSigma }^{\mathfrak {s}}, E\cup A, { 1R }^{\mathfrak {s}} \cup { 2R }^{\mathfrak {s}},{\varepsilon }^{\mathfrak {s}})\) from \(\mathfrak {R}(\mathcal {L})\) in the same way \({\mathcal {L}}^{\mathfrak {s}}\) is obtained from \(\mathcal {L}\), but this is not enough to obtain a coverage-like result. The program log in Fig. 4 is deterministic and terminating for each \(\vartheta (A)\in Int\). So we may execute any instance of it with an approximate encoding \(\mathfrak {R}\) having no second-layer rules, i.e., \( 2R =\emptyset \). If \({ 2R }^{\mathfrak {s}}=\emptyset \), then \({ 1R }^{\mathfrak {s}}\) is nonterminating, because there is an infinite execution corresponding to the case when the value of the program variable X in the current configuration is always greater than zero. Another problem is to specify how the strategy \(\varepsilon \) is extended to \({\varepsilon }^{\mathfrak {s}}\). Since it is hard to give general answers to these questions, we opted for a particular solution that can be implemented in Maude.
Definition 5
 1. the rewrite \(t \rightarrow _{(E\cup 1R )/A}^! \varepsilon (t)\) uses the minimal rule from \( 1R \) w.r.t. \(\prec \) whenever such a rule is applicable;
 2. if \(\alpha \) is unconditional and \(\alpha '\) is conditional then \(\alpha \prec \alpha '\).

\({ 1R }^{\mathfrak {s}} = \{ {\alpha }^{\mathfrak {s}} \mid \alpha \in 1R , \alpha ~\mathrm{unconditional }\}\);

\({ 2R }^{\mathfrak {s}} = \{ {\alpha }^{\mathfrak {s}} \mid \alpha \in 1R , \alpha ~\mathrm{conditional }\} \cup \{ {\alpha }^{\mathfrak {s}} \mid \alpha \in 2R \}\);

\({\alpha }^{\mathfrak {s}}\,{\prec }^{\mathfrak {s}}\,{\alpha '}^{\mathfrak {s}}\) iff \(\alpha \prec \alpha '\);

\({\varepsilon }^{\mathfrak {s}}\) uses the minimal rule from \({ 1R }^{\mathfrak {s}}\) w.r.t. \({\prec }^{\mathfrak {s}}\).
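The partition of the symbolic rules listed above can be sketched as a small Python function. This is a minimal illustration assuming rules are represented only by a name, a conditional flag, and an integer rank standing in for the order \(\prec \); the `Rule` class, the `symbolic_layers` function, and the sample rule names are hypothetical, and the symbolic transformation \(\alpha \mapsto {\alpha }^{\mathfrak {s}}\) itself is not modelled.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Rule:
    name: str
    conditional: bool
    rank: int  # position in the order; a smaller rank means a smaller rule


def symbolic_layers(layer1: List[Rule], layer2: List[Rule]) -> Tuple[List[Rule], List[Rule]]:
    """Split the rules of a two-layered theory for its symbolic counterpart.

    Unconditional first-layer rules stay in the first layer; conditional
    first-layer rules join the second layer together with all original
    second-layer rules. The first layer is returned sorted by rank, so the
    strategy can pick the minimal applicable rule by scanning it in order.
    """
    sym_layer1 = sorted((r for r in layer1 if not r.conditional), key=lambda r: r.rank)
    sym_layer2 = [r for r in layer1 if r.conditional] + list(layer2)
    return sym_layer1, sym_layer2


# Hypothetical rule set, for illustration only.
first = [Rule("if-true", True, 2), Rule("assign", False, 0), Rule("lookup", False, 1)]
second = [Rule("division", True, 3)]

sym1, sym2 = symbolic_layers(first, second)
print([r.name for r in sym1])  # prints ['assign', 'lookup']
print([r.name for r in sym2])  # prints ['if-true', 'division']
```

Moving conditional rules to the second layer is what lets a search explore both outcomes of a condition that mentions symbolic values, instead of committing to one of them during normalization.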
Theorem 5
(coverage for approximate rewrite theories). Let \(\mathcal {L}=(\varSigma ,\mathcal {T},\mathcal {S})\) be a language definition and \(\mathfrak {R}(\mathcal {L})=(\varSigma ,E\cup A, 1R \cup 2R ,\varepsilon )\) be an approximate encoding of \(\mathcal {L}\). For every concrete execution \(\gamma _0 \xrightarrow {}_{\mathfrak {R}(\mathcal {L})} \gamma _1 \xrightarrow {}_{\mathfrak {R}(\mathcal {L})} \cdots \xrightarrow {}_{\mathfrak {R}(\mathcal {L})} \gamma _n \xrightarrow {}_{\mathfrak {R}(\mathcal {L})} \cdots \) there is a symbolic execution \({\pi _0}\;\pmb {\wedge }\;{\phi _0} \xrightarrow {}^+_{\mathfrak {R}({\mathcal {L}}^{\mathfrak {s}})}{\pi _1}\;\pmb {\wedge }\;{\phi _1} \xrightarrow {}^+_{\mathfrak {R}({\mathcal {L}}^{\mathfrak {s}})} \cdots \xrightarrow {}^+_{\mathfrak {R}({\mathcal {L}}^{\mathfrak {s}})} {\pi _n}\;\pmb {\wedge }\;{\phi _n} \xrightarrow {}^+_{\mathfrak {R}({\mathcal {L}}^{\mathfrak {s}})} \cdots \) such that \(\gamma _i \in [\![{\pi _i}\;\pmb {\wedge }\;{\phi _i}]\!]\) for \(i = 0, 1, \ldots \).
However, the precision relationship between \(\mathfrak {R}(\mathcal {L})\) and \(\mathfrak {R}({\mathcal {L}}^{\mathfrak {s}})\) does not hold in general. The reason is that \({ 1R }^{\mathfrak {s}}\) has fewer rules than \( 1R \) and hence the representative-selection strategy \({\varepsilon }^{\mathfrak {s}}\) is weaker than \(\varepsilon \). Therefore there is no guarantee that the concrete execution given by Corollary 3 is the same as the one chosen by the strategy \(\varepsilon \). If the strategy \({\varepsilon }^{\mathfrak {s}}\) is the “isomorphic image” of \(\varepsilon \) via the transformation \(\bullet \mapsto {\bullet }^{\mathfrak {s}}\), then the precision result holds:
Theorem 6
(precision for approximate rewrite theories). Let \(\mathcal {L}=(\varSigma ,\mathcal {T},\mathcal {S})\) be a language definition and \(\mathfrak {R}(\mathcal {L})=(\varSigma ,E\cup A, 1R \cup 2R ,\varepsilon )\) be an approximate encoding of \(\mathcal {L}\) such that \( 1R \) includes only unconditional rules (hence \({ 1R }^{\mathfrak {s}}=\{{\alpha }^{\mathfrak {s}}\mid \alpha \in 1R \}\)). For every feasible symbolic execution \({\pi _0}\;\pmb {\wedge }\;{\phi _0} \xrightarrow {}_{\mathfrak {R}({\mathcal {L}}^{\mathfrak {s}})}{\pi _1}\;\pmb {\wedge }\;{\phi _1} \xrightarrow {}_{\mathfrak {R}({\mathcal {L}}^{\mathfrak {s}})} \cdots \xrightarrow {}_{\mathfrak {R}({\mathcal {L}}^{\mathfrak {s}})} {\pi _n}\;\pmb {\wedge }\;{\phi _n} \xrightarrow {}_{\mathfrak {R}({\mathcal {L}}^{\mathfrak {s}})} \cdots \) there is a concrete one \(\gamma _0 \xrightarrow {}_{\mathfrak {R}(\mathcal {L})} \gamma _1 \xrightarrow {}_{\mathfrak {R}(\mathcal {L})} \cdots \xrightarrow {}_{\mathfrak {R}(\mathcal {L})} \gamma _n \xrightarrow {}_{\mathfrak {R}(\mathcal {L})} \cdots \) such that \(\gamma _i \in [\![{\pi _i}\;\pmb {\wedge }\;{\phi _i}]\!]\) for \(i = 0, 1, \ldots \).
5 Implementing the \(\mathbb {K}\) Framework in Maude
The current implementation of the \(\mathbb {K}\) framework uses Maude as a rewrite engine. In [4], the framework, at that time called K-Maude, was presented as an extension of Maude consisting of several meta-transformations that gradually translate \(\mathbb {K}\) modules into executable Maude modules. In the current version of \(\mathbb {K}\) we use a compiler for language definitions where each of these meta-transformations is a separate compilation step. Through compilation, \(\mathbb {K}\) definitions are translated into Maude rewrite theories, which are then used for running and analysing programs. The main components of a \(\mathbb {K}\) definition are the syntax declarations, the configuration, and the \(\mathbb {K}\) (rewrite) rules. To these, the tool automatically adds the rules generated from strictness annotations (e.g., heating/cooling rules 1–4).
The work described in this article is concerned with how the set of rules is compiled into a two-layered rewrite theory, which is then encoded into Maude by using equations for the first-layer rules and rewrite rules for the second-layer rules. By default, all \(\mathbb {K}\) rules are translated into (conditional) equations, that is, \( 1R =\mathcal {S}\) and \( 2R =\emptyset \). This behavior can be altered by specifying (at compile time) that certain rules are to be considered transitions, which triggers their transformation into (conditional) rewrite rules in the resulting Maude module.
To specify that a rule is a transition, one must pass the rule name as an argument for the transition option at compilation time:
$ kompile cink.k transition "division"
The above command specifies the rule division as a transition; thus, the rule for division is included in \( 2R \). By this command we express our intent that the tool considers the rule for division as a transition when exploring an execution’s transition system. By making it a rewrite rule in Maude, we can explore the nondeterminism generated by the rule when using Maude’s search command.
Another source of nondeterminism arises from strictness annotations. When the strict attribute is given to some syntactical construct, the tool chooses by default an arbitrary, fixed order to evaluate its arguments. This optimisation has the side effect of possibly losing behaviours due to missed interleavings.
Some of these missed interleavings can be restored using the superheat option. This option is used to instruct the \(\mathbb {K}\) tool to exhaustively explore all the nondeterministic evaluation choices for the strictness of a language construct.
Once we know which rules are transitions and which are not, we can easily deduce the two sets \( 1R \) and \( 2R \), and thus we obtain the executable rewrite theory \(\mathfrak {R}(\mathcal {L})\) as discussed in Sect. 4.
The following example shows how one can explore more behaviours by specifying second-layer rules at compile time. If we compile the language definition of CinK without any options, then running the program counter (Fig. 4) will result in a single solution, where the return value is either 1 (when the tool first evaluates dec() and then inc()) or 3 (when it first evaluates inc() and then dec()). However, if we set the operation plus as superheat:
$ kompile cink superheat "plus"
then we obtain both solutions, because the heating rule for addition can be applied in two ways and the option tells the tool to explore them both.
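The effect of exploring both heating orders can be mimicked with a few lines of Python. The counter program itself is not reproduced in this section, so the sketch below is a hypothetical reconstruction: it assumes a shared counter initialised to 1, with inc() and dec() returning the updated value, which is one setting consistent with the results 1 and 3 reported above. It illustrates the idea of enumerating evaluation orders only; it is not how the \(\mathbb {K}\) tool implements superheat.

```python
from itertools import permutations


def run_sum(order):
    """Evaluate inc() + dec(), computing the operands in the given order."""
    state = {"x": 1}  # assumed initial counter value (hypothetical)

    def inc():
        state["x"] += 1
        return state["x"]

    def dec():
        state["x"] -= 1
        return state["x"]

    operands = {"inc": inc, "dec": dec}
    # The dict comprehension calls the operands in the order given by `order`.
    values = {name: operands[name]() for name in order}
    return values["inc"] + values["dec"]


# A fixed evaluation order yields a single result ...
print(run_sum(("inc", "dec")))  # prints 3
# ... whereas exploring every order of the two operands yields both.
print(sorted({run_sum(order) for order in permutations(("inc", "dec"))}))  # prints [1, 3]
```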
The symbolic transformations discussed in Sect. 3.2 are implemented as compilation steps in the \(\mathbb {K}\) compiler [2]. The tool uses the same translation to Maude discussed above in order to obtain the rewrite theory \(\mathfrak {R}({\mathcal {L}}^{\mathfrak {s}})\). An important step in this process is that conditional rules whose conditions cannot be reduced to true are compiled as transitions, that is, they are included in \( 2R \). When performing search in Maude, these rules are essential for exploring all the execution paths, thereby ensuring the coverage property (Theorem 5). Note that none of the symbolic transformations applied by the tool to the language definition changes the initial semantics of the language.
The implementation uses a slightly modified version of Maude which includes a hook to the Z3 SMT solver [5] and a corresponding operation called checkSat. It receives as argument an SMT-LIB string, which is sent to the solver to check its satisfiability. The result returned by the solver is propagated back through the hook to Maude as a string, so checkSat can return “sat”, “unsat”, or “unknown”. In practice, our tool uses checkSat to reduce the search space by pruning infeasible execution paths, which is essential for preserving the precision property. To obtain \(\mathfrak {R}({\mathcal {L}}^{\mathfrak {s}})\) from a language definition one uses the symbolic backend as follows:
$ kompile cink backend symbolic
This command applies the symbolic transformations, moves the appropriate rules into \( 2R \), and generates the rewrite theory \(\mathfrak {R}({\mathcal {L}}^{\mathfrak {s}})\). Using \(\mathfrak {R}({\mathcal {L}}^{\mathfrak {s}})\) one can execute programs using either concrete values or symbolic ones. However, running programs with symbolic values may lead to infinite loops when the loop conditions contain symbolic values. In such cases one can bound the number of execution paths:
$ krun log.imp search bound 3 cIN=".List" cPC="true"
This executes log (Fig. 4) symbolically until 3 solutions are found. Each solution consists of a result configuration and a formula which constitutes the path condition. The symbolic values are represented as fresh variables with a specific sort (e.g. A:Int). These can also be passed as input at the command line of the tool as arguments of the cIN parameter. Users can also set the initial path condition using the cPC option. During the symbolic execution the tool applies a rule only if the next state is feasible, that is, only if the conjunction of the current path condition and the new conditions imposed by the application of the rule is not “unsat”.
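The combination of a solution bound with feasibility pruning can be sketched in Python. The sketch makes several loudly stated assumptions: it does not run the log program of Fig. 4 but a hypothetical loop `x = A; k = 0; while (x > 1) { x = x / 2; k = k + 1; }`; the symbolic input A is assumed to range over a small finite domain; and `check_sat` is a brute-force stand-in for the solver hook, not Z3 and not the tool's checkSat operation.

```python
from collections import deque

DOMAIN = range(-64, 65)  # assumed finite range for the symbolic input A


def check_sat(path_condition):
    """Brute-force stand-in for a solver call: look for a witness value of A."""
    for a in DOMAIN:
        if all(pred(a) for pred in path_condition):
            return "sat", a
    return "unsat", None


def symbolic_search(bound, initial_pc=()):
    """Breadth-first symbolic execution of the hypothetical loop.

    Returns at most `bound` pairs (k, witness), where k is the final loop
    count on that path and witness is one value of A satisfying its path
    condition. A branch is followed only if its path condition is not unsat.
    """
    solutions = []
    # State: (x as a function of A, loop counter k, path condition).
    queue = deque([(lambda a: a, 0, tuple(initial_pc))])
    while queue and len(solutions) < bound:
        x, k, pc = queue.popleft()
        # Branch 1: the loop condition is false, execution terminates.
        exit_pc = pc + ((lambda a, x=x: x(a) <= 1),)
        verdict, witness = check_sat(exit_pc)
        if verdict != "unsat":
            solutions.append((k, witness))
        # Branch 2: the loop condition is true, execute the body once.
        stay_pc = pc + ((lambda a, x=x: x(a) > 1),)
        if check_sat(stay_pc)[0] != "unsat":
            queue.append((lambda a, x=x: x(a) // 2, k + 1, stay_pc))
    return solutions


print(symbolic_search(3))                       # prints [(0, -64), (1, 2), (2, 4)]
print(symbolic_search(3, [lambda a: a > 1]))    # prints [(1, 2), (2, 4), (3, 8)]
```

In the second call the initial path condition `A > 1` makes the immediate-exit branch infeasible, so it is pruned and the three solutions reported correspond to one, two, and three loop iterations.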
6 Conclusion and Related Work
We presented some results that relate language definitions to different kinds of rewrite theories, which encode the language definitions both faithfully and approximately. The results show how (symbolic) analyses performed on a rewrite theory are reflected on the corresponding language definition. The general results are applied to the current implementation of \(\mathbb {K}\) language definitions in Maude.
The faithful encoding of \(\mathbb {K}\) language definitions as rewrite theories is relatively simple, but the resulting theory is not efficient in practice. Therefore we extended the notion of rewrite theory in order to work with under-approximations of the language definitions (and implicitly of the rewrite theories). The approximating theories are more efficient and flexible (the user has the freedom to work with various levels of approximation), but their use for program analysis requires care because they do not preserve all the behavioural properties. The coverage/precision results proved in this paper can help the user in correctly assessing which analyses hold on which representations.
Related Work. \(\mathbb {K}\) started as a methodology for defining the semantics of programming languages in Maude. The first tool supporting \(\mathbb {K}\) [4] was written in Maude’s metalevel, as a series of transformations translating \(\mathbb {K}\) definitions into Maude programs. Then the \(\mathbb {K}\) compiler became a more complex tool that translates a \(\mathbb {K}\) definition into an intermediate language, which is then used to generate code for various backends, including Maude. A presentation of this tool is given in [8]. There, a brief description of the semantics of \(\mathbb {K}\) definitions is also included. The programming-language definition framework presented here in Sect. 3 is a specialised case of that definition.
The coverage and precision properties, which relate the faithful rewrite-theory encoding of a language and of that language’s symbolic version, are analogous to the soundness and completeness results in [10], which relate usual rewriting and rewriting modulo SMT. An interesting alternative to defining symbolic executions as executions in a transformed language (as we do in [2]) would be to compile a language into a rewriting-modulo-SMT Maude module.
Our construction of two-layered rewrite theories has some similarities with equational abstractions [9] and with the state-space reduction techniques obtained by transforming rules into equations presented in [6]. However, our first-layer rewrite rules do not equate states as Maude equations do; their semantics is that of transformation, not of equality. Therefore these rules do not have to satisfy the executability and property-preservation requirements of [6, 9].
Acknowledgement
This work was supported by the strategic grant POSDRU/159/1.5/S/137750, “Project Doctoral and Postdoctoral programs support for increased competitiveness in Exact Sciences research” co-financed by the European Social Fund within the Sectorial Operational Program Human Resources Development 2007–2013.
References
 1. Standard for Programming Language C++. Working Draft. http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3797.pdf
 2. Arusoaie, A., Lucanu, D., Rusu, V.: A generic framework for symbolic execution. In: Erwig, M., Paige, R.F., Van Wyk, E. (eds.) SLE 2013. LNCS, vol. 8225, pp. 281–301. Springer, Heidelberg (2013). (Also available as a technical report at http://hal.inria.fr/hal-00766220/)
 3. Clavel, M., Durán, F., Eker, S., Lincoln, P., Martí-Oliet, N., Meseguer, J., Talcott, C.: All About Maude - A High-Performance Logical Framework. LNCS, vol. 4350. Springer, Heidelberg (2007)
 4. Şerbănuţă, T.F., Roşu, G.: K-Maude: a rewriting based tool for semantics of programming languages. In: Ölveczky, P.C. (ed.) WRLA 2010. LNCS, vol. 6381, pp. 104–122. Springer, Heidelberg (2010)
 5. de Moura, L., Bjørner, N.S.: Z3: an efficient SMT solver. In: Ramakrishnan, C.R., Rehof, J. (eds.) TACAS 2008. LNCS, vol. 4963, pp. 337–340. Springer, Heidelberg (2008)
 6. Farzan, A., Meseguer, J.: State space reduction of rewrite theories using invisible transitions. In: Johnson, M., Vene, V. (eds.) AMAST 2006. LNCS, vol. 4019, pp. 142–157. Springer, Heidelberg (2006)
 7. Lucanu, D., Şerbănuţă, T.F.: CinK - an exercise on how to think in K. Technical Report TR 12–03, Version 2, Alexandru Ioan Cuza University, Faculty of Computer Science, December 2013
 8. Lucanu, D., Şerbănuţă, T.F., Roşu, G.: \(\mathbb{K}\) framework distilled. In: Durán, F. (ed.) WRLA 2012. LNCS, vol. 7571, pp. 31–53. Springer, Heidelberg (2012)
 9. Meseguer, J., Palomino, M., Martí-Oliet, N.: Equational abstractions. Theor. Comput. Sci. 403(2–3), 239–264 (2008)
 10. Rocha, C., Meseguer, J., Munoz, C.A.: Rewriting modulo SMT. In: Escobar, S. (ed.) WRLA 2014. LNCS, vol. 8663, pp. 247–262. Springer, Heidelberg (2014)
 11. Roşu, G., Şerbănuţă, T.F.: An overview of the K semantic framework. J. Logic Algebraic Program. 79(6), 397–434 (2010)
 12. Roşu, G., Ştefănescu, A.: Checking reachability using matching logic. In: Leavens, G.T., Dwyer, M.B. (eds.) OOPSLA, pp. 555–574. ACM (2012)
 13. Viry, P.: Equational rules for rewriting logic. Theor. Comput. Sci. 285(2), 487–517 (2002)