A Functional Derivation of Small-Step Evaluators from Big-Step Counterparts

Big-step and small-step are two popular flavors of operational semantics. Big-step is often seen as a more natural transcription of informal descriptions, as well as being more convenient for some applications such as interpreter generation or optimization verification. Small-step allows reasoning about non-terminating computations, concurrency and interactions. It is also generally preferred for reasoning about type systems. Instead of having to manually specify equivalent semantics in both styles for different applications, it would be useful to choose one and derive the other in a systematic or, preferably, automatic way. Transformations of small-step semantics into big-step have been investigated in various forms by Danvy and others. However, it appears that a corresponding transformation from big-step to small-step semantics has not received the same attention. We present a fully automated transformation that maps big-step evaluators written in direct style to their small-step counterparts. Many of the steps in the transformation, which include CPS-conversion, defunctionalisation, and various continuation manipulations, mirror those used by Danvy and his co-authors. For many standard languages, including those with either call-by-value or call-by-need and those with state, the transformation produces small-step semantics that are close in style to handwritten ones. We evaluate the applicability and correctness of the approach on 20 languages with a range of features.


Introduction
Operational semantics allow language designers to precisely and concisely specify the meaning of programs. Such semantics support formal type soundness proofs [29], give rise (sometimes automatically) to simple interpreters [15,27] and debuggers [14], and document the correct behavior for compilers. There are two popular approaches for defining operational semantics: big-step and small-step. Big-step semantics (also referred to as natural or evaluation semantics) relate initial program configurations directly to final results in one "big" evaluation step. In contrast, small-step semantics relate intermediate configurations consisting of the term currently being evaluated and auxiliary information. The initial configuration corresponds to the entire program, and the final result, if there is one, can be obtained by taking the transitive-reflexive closure of the small-step relation. Thus, computation progresses as a series of "small steps."
The two styles have different strengths and weaknesses, making them suitable for different purposes. For example, big-step semantics naturally correspond to definitional interpreters [23], meaning many big-step semantics can essentially be transliterated into a reasonably efficient interpreter in a functional language. Big-step semantics are also more convenient for verifying program optimizations and compilation: using big-step, semantic preservation can be verified (for terminating programs) by induction on the derivation [20,22].
In contrast, small-step semantics are often better suited for stepping through the evaluation of an example program, and for devising a type system and proving its soundness via the classic syntactic method using progress and preservation proofs [29]. As a result, researchers sometimes develop multiple semantic specifications and then argue for their equivalence [3,20,21]. In an ideal situation, the specifier writes down a single specification and then derives the others.
Approaches to deriving big-step semantics from a small-step variant have been investigated on multiple occasions, starting from semantics specified as either interpreters or rules [4,7,10,12,13]. An obvious question is: what about the reverse direction?
This paper presents a systematic, mechanised transformation from a big-step interpreter into its small-step counterpart. The overall transformation consists of multiple stages performed on an interpreter written in a functional programming language. For the most part, the individual transformations are well known.
The key steps in this transformation are to explicitly represent control flow as continuations, to defunctionalise these continuations to obtain a datatype of reified continuations, to "tear off" recursive calls to the interpreter, and then to return the reified continuations, which represent the rest of the computation. This process effectively produces a stepping function. The remaining work consists of finding translations from the reified continuations to equivalent terms in the source language. If such a term cannot be found, we introduce a new term constructor. These new constructors correspond to the intermediate auxiliary forms commonly found in handwritten small-step definitions.
We define the transformations on our evaluator definition language, an extension of λ-calculus with call-by-value semantics. The language is untyped and, crucially, includes tagged values (variants) and a case analysis construct for building and analysing object language terms. Our algorithm takes as input a big-step interpreter written in this language in the usual style: a main function performing case analysis on a top-level term constructor and recursively calling itself or auxiliary functions. As output, we return the resulting small-step interpreter, which we can "pretty-print" as a set of small-step rules in the usual style. Hence our algorithm provides a fully automated path from a restricted class of big-step semantic specifications written as interpreters to corresponding small-step versions.
To evaluate our algorithm, we have applied it to 20 different languages with various features, including languages based on call-by-name and call-by-value λ-calculi, as well as a core imperative language. We extend these base languages with conditionals, loops, and exceptions.
We make the following contributions:
- We present a multi-stage, automated transformation that maps any deterministic big-step evaluator into a small-step counterpart. Section 2 gives an overview of this process. Each stage in the transformation is familiar and principled, and is performed on our evaluator definition language, an extended call-by-value λ-calculus. Section 4 gives a detailed description.
- We have implemented the transformation process in Haskell and evaluate it on a suite of 20 representative languages in Section 5. We argue that the resulting small-step evaluation rules closely mirror what one would expect from a manually written small-step specification.
- We observe that the same process with minimal modifications can be used to transform a big-step semantics into its pretty-big-step [6] counterpart.

Overview
In this section, we provide an overview of the transformation steps on a simple example language. The diagram in Fig. 1 shows the transformation pipeline. As the initial step, we convert the input big-step evaluator into continuation-passing style (CPS). We limit the conversion to the eval function itself and leave all other functions in direct style. The resulting continuations take a value as input and advance the computation. In the generalization step, we modify these continuations so that they take an arbitrary term and evaluate it to a value before continuing as before. With this modification, each continuation handles both the general non-value case and the value case itself. The next stage lifts a carefully chosen set of free variables as arguments to continuations, which allows us to define all of them at the same scope level. After generalization and argument lifting, we can invoke continuations directly to switch control, instead of passing them as arguments to the eval function. Next we defunctionalize the continuations, converting them into a set of tagged values together with an apply function capturing their meaning. This transformation enables the next step, in which we remove recursive tail-calls to apply. This allows us to interrupt the interpreter and make it return a continuation or a term: effectively, it yields a stepping function, which is the essence of a small-step semantics. The remainder of the pipeline converts continuations to terms, performs simplifications, and then converts the CPS evaluator back to direct style to obtain the final small-step interpreter. This interpreter can be pretty-printed as a set of small-step rules.
Our example language is a λ-calculus with call-by-value semantics. Fig. 2 gives its syntax and big-step rules. We use environments to give meaning to variables. The only values in this language are closures, formed by packaging a λ-abstraction with an environment.
We will now give a series of interpreters to illustrate the transformation process. We formally define the syntax of the meta-language in which we write these interpreters in Section 3, but we believe for readers familiar with functional programming the language is intuitive enough to not require a full explanation at this point. Shaded text highlights (often small) changes to subsequent interpreters.
Big-Step Evaluator. We start with an interpreter corresponding directly to the big-step semantics given in Fig. 2. We represent environments as functions: the empty environment returns an error for any variable. The body of the eval function consists of a pattern match on the top-level language term. Function abstractions are evaluated to closures by packaging them with the current environment. The only term that requires recursive calls to eval is application: both its arguments are evaluated in the current environment, and then the value of its first argument is pattern-matched against a closure, the body of which is then evaluated to a value in an extended environment using a third recursive call to eval.
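The evaluator described above can be sketched as follows. This is a minimal illustration in Python rather than in the meta-language of the paper; the tagged-tuple encoding of terms, the field order of closures, and the names `eval_big`, `empty_env`, and `update` are assumptions made for the sketch.

```python
# Terms are tagged tuples (an assumed encoding):
#   ("var", x), ("lam", x, body), ("app", e1, e2)
# Values are closures: ("clo", x, body, env)

def empty_env(x):
    # The empty environment signals an error for every variable.
    raise NameError(f"unbound variable: {x}")

def update(x, v, env):
    # Environments are functions; extend env with the binding x -> v.
    return lambda y: v if y == x else env(y)

def eval_big(term, env):
    tag = term[0]
    if tag == "var":
        return env(term[1])
    if tag == "lam":
        _, x, body = term
        return ("clo", x, body, env)      # package the abstraction with env
    if tag == "app":
        _, e1, e2 = term
        v1 = eval_big(e1, env)            # first recursive call
        v2 = eval_big(e2, env)            # second recursive call
        _, x, body, clo_env = v1          # v1 is expected to be a closure
        return eval_big(body, update(x, v2, clo_env))  # third recursive call
    raise ValueError(f"unknown term: {term!r}")

identity = ("lam", "x", ("var", "x"))
const_fn = ("lam", "y", ("lam", "z", ("var", "y")))
result = eval_big(("app", identity, const_fn), empty_env)
```

Applying the identity function to `const_fn` yields the closure for `const_fn`, paired with the empty environment.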
CPS Conversion. Our first transformation introduces a continuation argument to eval, capturing the "rest of the computation" [9,26,28]. Instead of returning the resulting value directly, eval will pass it to the continuation. For our example we need to introduce three continuations, all of them in the case for app. The continuation kapp1 captures what remains to be done after evaluating the first argument of app, kapp2 captures the computation remaining after evaluating the second argument, and kclo1 the computation remaining after the closure body is fully evaluated. This final continuation simply applies the top-level continuation to the resulting value and might seem redundant; however, its utility will become apparent in the following step. Note that the CPS conversion is limited to the eval function, leaving any other functions in the program intact.
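A minimal sketch of the CPS-converted evaluator, again in Python and under the same assumed tagged-tuple encoding of terms. The continuation names kapp1, kapp2, and kclo1 follow the text; `eval_cps` and the helper functions are hypothetical names. Only the evaluator takes a continuation; `update` stays in direct style.

```python
def empty_env(x):
    raise NameError(f"unbound variable: {x}")

def update(x, v, env):
    # Left in direct style: the conversion is limited to the evaluator.
    return lambda y: v if y == x else env(y)

def eval_cps(term, env, k):
    tag = term[0]
    if tag == "var":
        return k(env(term[1]))
    if tag == "lam":
        return k(("clo", term[1], term[2], env))
    if tag == "app":
        _, e1, e2 = term

        def kapp1(v1):
            # What remains after the first argument has been evaluated.
            def kapp2(v2):
                # What remains after the second argument has been evaluated.
                _, x, body, clo_env = v1

                def kclo1(v):
                    # What remains after the closure body has been evaluated:
                    # pass the result to the top-level continuation.
                    return k(v)

                return eval_cps(body, update(x, v2, clo_env), kclo1)

            return eval_cps(e2, env, kapp2)

        return eval_cps(e1, env, kapp1)
    raise ValueError(f"unknown term: {term!r}")

identity = ("lam", "x", ("var", "x"))
const_fn = ("lam", "y", ("lam", "z", ("var", "y")))
result = eval_cps(("app", identity, const_fn), empty_env, lambda v: v)
```

Passing the identity function as the initial continuation recovers the behaviour of the direct-style evaluator.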
Generalization. Next, we modify the continuation definitions so that they handle both the case when the term is a value (the original case) and the case where it is still a term that needs to be evaluated. To achieve this goal, we introduce a case analysis on the input. If the continuation's argument is a value, the evaluation will proceed as before. Otherwise it will call eval with itself as the continuation argument. Intuitively, the latter case will correspond to a congruence rule in the resulting small-step semantics and we refer to these as congruence cases in the rest of this paper.
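The generalization step can be sketched as follows, assuming the same Python encoding as before plus a `("val", v)` wrapper that marks a term as already evaluated. The function name `eval_gen` is hypothetical. Each continuation now inspects its argument: the value case proceeds as before, and the congruence case calls the evaluator with the continuation itself.

```python
def empty_env(x):
    raise NameError(f"unbound variable: {x}")

def update(x, v, env):
    return lambda y: v if y == x else env(y)

def eval_gen(term, env, k):
    tag = term[0]
    if tag == "val":
        return k(term)                      # values are passed on unchanged
    if tag == "var":
        return k(("val", env(term[1])))     # results are wrapped in val
    if tag == "lam":
        return k(("val", ("clo", term[1], term[2], env)))
    if tag == "app":
        _, e1, e2 = term

        def kapp1(t1):
            if t1[0] != "val":
                return eval_gen(t1, env, kapp1)        # congruence case
            v1 = t1[1]

            def kapp2(t2):
                if t2[0] != "val":
                    return eval_gen(t2, env, kapp2)    # congruence case
                _, x, body, clo_env = v1
                env2 = update(x, t2[1], clo_env)

                def kclo1(t):
                    if t[0] != "val":
                        return eval_gen(t, env2, kclo1)  # congruence case
                    return k(t)

                return eval_gen(body, env2, kclo1)

            return eval_gen(e2, env, kapp2)

        return eval_gen(e1, env, kapp1)
    raise ValueError(f"unknown term: {term!r}")

identity = ("lam", "x", ("var", "x"))
const_fn = ("lam", "y", ("lam", "z", ("var", "y")))
result = eval_gen(("app", identity, const_fn), empty_env, lambda t: t)
```

In this sketch the congruence cases are never reached, because the evaluator only ever hands values to its continuation; they become relevant once continuations are invoked directly on unevaluated terms.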
[Listing residue: excerpt of the generalized evaluator. The congruence case of each continuation re-enters eval with the continuation itself, e.g. ELSE(e1) → (eval e1 ρ (λe1′. (kapp1 e1′))), while the original call site remains (eval e1 ρ (λv1. (kapp1 v1))).]

Argument Lifting. The free variables inside each continuation can be divided into those that depend on the top-level term and those that parameterize the evaluation. The former category contains variables dependent on subterms of the top-level term, either by standing for a subterm itself, or by being derived from it. In our example, for kapp1, it is the variable e2, i.e., the right argument of app; for kapp2, the variable v1 as the value resulting from evaluating the left argument; and for kclo1 it is the environment obtained by extending the closure's environment by binding the closure variable to the operand value (ρ′ derived from v2). We lift variables that fall into the first category, that is, variables derived from the input term. We leave variables that parameterize the evaluation, such as the input environment or the store, unlifted. The rationale is that, eventually, we want the continuations to act as term constructors and they need to carry information not contained in arguments passed to eval.

Continuations Switch Control. Since continuations now handle the full evaluation of their argument themselves, they can be used to switch stages in the evaluation of a term. Observe how in the resulting evaluator, the evaluation of an app term progresses through stages initiated by kapp1, kapp2, and finally kclo1.
Defunctionalization. In the next step, we defunctionalize continuations. For each continuation, we introduce a constructor with the corresponding number of arguments. The apply function gives the meaning of each defunctionalized continuation.
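As a sketch of the defunctionalized evaluator, assume the Python encoding used for illustration so far: each continuation becomes a tagged tuple carrying its lifted arguments followed by the term it is waiting on, and `apply_k` gives its meaning. The argument order kapp1(e2, e1) and kapp2(v1, e2) follows the text; `apply_k` and `eval_d` are hypothetical names. Python permits mutual recursion, so unlike the paper's apply, `apply_k` does not take the evaluator as an argument.

```python
def empty_env(x):
    raise NameError(f"unbound variable: {x}")

def update(x, v, env):
    return lambda y: v if y == x else env(y)

def apply_k(kont, env, k):
    tag = kont[0]
    if tag == "kapp1":
        _, e2, t1 = kont
        if t1[0] == "val":
            return apply_k(("kapp2", t1[1], e2), env, k)   # switch stage
        # congruence case: evaluate t1, then re-enter the same stage
        return eval_d(t1, env, lambda t: apply_k(("kapp1", e2, t), env, k))
    if tag == "kapp2":
        _, v1, t2 = kont
        if t2[0] == "val":
            _, x, body, clo_env = v1
            env2 = update(x, t2[1], clo_env)
            return apply_k(("kclo1", env2, body), env, k)  # switch stage
        return eval_d(t2, env, lambda t: apply_k(("kapp2", v1, t), env, k))
    if tag == "kclo1":
        _, env2, t = kont
        if t[0] == "val":
            return k(t)
        # the body is evaluated in the lifted environment env2
        return eval_d(t, env2, lambda t2: apply_k(("kclo1", env2, t2), env, k))
    raise ValueError(f"unknown continuation: {kont!r}")

def eval_d(term, env, k):
    tag = term[0]
    if tag == "val":
        return k(term)
    if tag == "var":
        return k(("val", env(term[1])))
    if tag == "lam":
        return k(("val", ("clo", term[1], term[2], env)))
    if tag == "app":
        _, e1, e2 = term
        return apply_k(("kapp1", e2, e1), env, k)
    raise ValueError(f"unknown term: {term!r}")

identity = ("lam", "x", ("var", "x"))
const_fn = ("lam", "y", ("lam", "z", ("var", "y")))
result = eval_d(("app", identity, const_fn), empty_env, lambda t: t)
```

The continuations are now first-order data, which is what later allows them to be returned from the evaluator and translated into terms.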
Remove Tail-Calls. We can now move from a recursive evaluator to a stepping function by modifying the continuation arguments passed to eval in congruence cases. Instead of calling apply on the defunctionalized continuation, we return the defunctionalized continuation itself. Note that we leave intact those calls to apply that switch control between different continuations (e.g., in the definition of eval).
Convert Continuations into Terms. At this point, we have a stepping function that returns either a term or a continuation, but we want a function returning only terms. The most straightforward approach to achieving this goal would be to introduce a term constructor for each defunctionalized continuation constructor. However, many of these continuation constructors can be trivially expressed using constructors already present in the object language. We want to avoid introducing redundant terms, so we aim to reuse existing constructors as much as possible. In our example we observe that kapp1(e2, e1) corresponds to app(e1, e2), while kapp2(v1, e2) corresponds to app(val(v1), e2). We might also observe that kclo1 admits a similar correspondence with an app term whose operator is a closure and whose operand is a value (see Section 5). Our current implementation doesn't handle such cases, however, and so we introduce kclo1 as a new term constructor.
Inlining and Simplification. Next, we eliminate the apply function by inlining its applications and simplifying the result. At this point we have obtained a small-step interpreter in continuation-passing style.
Convert to Direct Style and Remove the Value Case. The final transformation is to convert our small-step interpreter back to direct style. Moreover, we also remove the value case val(v) → val(v), as we usually do not want values to step.
Small-Step Evaluator. Fig. 3 shows the small-step rules corresponding to our last interpreter. Barring the introduction of the kclo1 constructor, the resulting semantics is essentially identical to one we would write manually.
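A stepping function in the spirit of the derived semantics can be sketched as below. This is an illustration written by hand in Python under the assumed tagged-tuple encoding, not the output of the transformation; `step` and `run` are hypothetical names, and kclo1 carries the extended environment together with the closure body.

```python
def empty_env(x):
    raise NameError(f"unbound variable: {x}")

def update(x, v, env):
    return lambda y: v if y == x else env(y)

def step(term, env):
    # One small step; values ("val", v) do not step.
    tag = term[0]
    if tag == "var":
        return ("val", env(term[1]))
    if tag == "lam":
        return ("val", ("clo", term[1], term[2], env))
    if tag == "app":
        _, e1, e2 = term
        if e1[0] != "val":
            return ("app", step(e1, env), e2)      # congruence: operator
        if e2[0] != "val":
            return ("app", e1, step(e2, env))      # congruence: operand
        _, x, body, clo_env = e1[1]
        return ("kclo1", update(x, e2[1], clo_env), body)
    if tag == "kclo1":
        _, env2, body = term
        if body[0] == "val":
            return body                            # return the result
        return ("kclo1", env2, step(body, env2))   # congruence: body
    raise ValueError(f"cannot step: {term!r}")

def run(term, env):
    # Iterate step until a value is reached, counting the steps taken.
    steps = 0
    while term[0] != "val":
        term = step(term, env)
        steps += 1
    return term, steps

identity = ("lam", "x", ("var", "x"))
const_fn = ("lam", "y", ("lam", "z", ("var", "y")))
final, n_steps = run(("app", identity, const_fn), empty_env)
```

Iterating `step` computes the reflexive-transitive closure of the step relation; on this example the final value is reached after five steps.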

Big-Step Specifications
We define our transformations on an untyped extended λ-calculus with call-by-value semantics that allows the straightforward definition of big- and small-step interpreters. We call this language an evaluator definition language (EDL). Table 1 gives the syntax of EDL. We choose to restrict ourselves to A-normal form, which greatly simplifies our partial CPS conversion without compromising readability. Our language has the usual call-by-value semantics, with arguments being evaluated left-to-right. All of the examples of the previous section were written in this language. Our language has three forms of let-binding constructs: the usual (optionally recursive) let, a let-construct for evaluator definition, and a let-construct for defining continuations. The behavior of all three constructs is the same; however, we treat them differently during the transformations. The leteval construct also comes with the additional static restriction that it may appear only once (i.e., there can be only one evaluator). The leteval and letcont forms are recursive by default, while let has an optional rec specifier to create a recursive binding. For simplicity, our language does not offer implicit mutual recursion, so mutual recursion has to be made explicit by inserting additional arguments. We do this when we generate the apply function during defunctionalization.

Evaluator Definition Language
Notation and Presentation. We use vector notation to denote syntactic lists belonging to a particular sort. For example, $\vec{e}$ and $\vec{ae}$ are lists of elements of, respectively, Expr and AExpr, while $\vec{x}$ is a list of variables. Separators can be spaces (e.g., function arguments) or commas (e.g., constructor arguments or configuration components); we expect the actual separator to be clear from the context. In let bindings, $f\ x_1 \ldots x_n = e$ and $f = \lambda x_1 \ldots x_n.\ e$ are both syntactic sugar for $f = \lambda x_1. \ldots \lambda x_n.\ e$.

Table 1 (excerpt). Syntax of EDL expressions:
Expr e ::= let bn = ce in e        (let-binding)
         | let rec bn = ce in e    (recursive let-binding)

Transformation Steps
In this section, we formally define each of the transformation steps informally described in Section 2. For each transformation function, we list only the most relevant cases; the remaining cases trivially recurse on the A-normal form (ANF) abstract syntax. We annotate functions with E, CE, and AE to indicate the corresponding ANF syntactic classes. We omit annotations when a function only operates on a single syntactic class. For readability, we annotate meta-variables to hint at their intended use: ρ stands for read-only entities (such as environments), whereas σ stands for read-write or "state-like" entities of a configuration (e.g., stores or exception states). These can be mixed with our notation for syntactic lists, so, for example, $\vec{x}_\sigma$ is a sequence of variables referring to state-like entities, while $\vec{ae}_\rho$ is a sequence of a-expressions corresponding to read-only entities.

CPS Conversion
The first stage of the process is a partial CPS conversion [8,25] to make control flow in the evaluator explicit. We limit this transformation to the main evaluator function, i.e., only the function eval will take an additional continuation argument and will pass results to it. Because our input language is already in ANF, the conversion is relatively easy to express. In particular, applications of the evaluator are always let-bound to a variable (or appear in a tail position), which makes constructing the current continuation straightforward. For this transformation we assume the following easily checkable properties:
- The evaluator name is globally unique.
- The evaluator is never applied partially.
- All bound variables are distinct.
The conversion is defined as three mutually recursive functions. In their defining equations, let' is a pseudo-construct used to make renormalization more readable. In essence, it is a non-ANF version of let where the bound expression is generalized to Expr. Note that renorm only works correctly if x ∉ fv(e), which is implied by our assumption that all bound variables are distinct.

Generalization of Continuations
The continuations resulting from the above CPS conversion expect to be applied to value terms. The next step is to generalize (or "lift") the continuations so that they recursively call the evaluator to evaluate non-value arguments. In other words, assuming the term type can be factored into values and computations V + C, we convert each continuation k with the type V → V into a continuation k′ : V + C → V that performs a case analysis on its argument: in the value case it proceeds as k did, and in the non-value case it calls eval on the argument with k′ itself as the continuation. The recursive clauses will correspond to congruence rules in the resulting small-step semantics.
The transformation works by finding the unique application site of the continuation and then inserting the corresponding call to eval in the non-value case. In its definition:
- findApp k e is the unique use site of the continuation k in expression e, that is, the CExpr where eval is applied with k as its continuation; and
- each variable x has an associated fresh variable, which stands for "a term corresponding to (the value) x".
Following the CPS conversion, each named continuation is applied exactly once in e, so findApp k e is total and returns the continuation's unique use site. Moreover, because the continuation was originally defined and let-bound at that use site, all free variables in findApp k e are also free in the definition of k.
When performing this generalization transformation, we also modify tail positions in eval that return a value so that they wrap their result in the val constructor. That is, if the continuation parameter of eval is k, then we rewrite all sites applying k to a configuration as follows:

$$k\ \langle ae, \vec{ae}_\sigma \rangle \;\Rightarrow\; k\ \langle \mathsf{val}(ae), \vec{ae}_\sigma \rangle$$

Argument Lifting in Continuations
In the next phase, we partially lift free variables in continuations to make them explicit arguments. We perform a selective lifting in that we avoid lifting non-term arguments to the evaluation function. These arguments represent entities that parameterize the evaluation of a term. If an entity is modified during evaluation, the modified entity variable gets lifted. In the running example of Section 2, such a lifting occurred for kclo1.
Function lift specifies the transformation both at the continuation definition site and at the continuation application site. Recall that continuations are always applied fully, but at this point they are only applied to one argument. Our lifting function is a restricted version of a standard argument-lifting algorithm [19]. The first restriction is that we do not lift all free variables, since we do not aim to float and lift the continuations to the top-level of the program, only to the top-level of the evaluation function. The other difference is that we can use a simpler way to compute the set of lifted parameters due to the absence of mutual recursion between continuations. The correctness of this lifting can be proved using the approach of Fischbach [16].

Continuations Switch Control Directly
At this point, continuations handle the full evaluation of a term themselves. Instead of calling eval with the continuation as an argument, we can call the continuation directly to switch control between evaluation stages of a term. We will replace original eval call sites with direct applications of the corresponding continuations. The recursive call to eval in congruence cases of continuations will be left untouched, as this is where the continuation's argument will be evaluated to a value. Following from the continuation generalization transformation, this call to eval is with the same arguments as in the original site (which we are now replacing). In particular, eval is invoked with the same $\vec{ae}_\rho$ arguments in the continuation body as in the original call site.

Defunctionalization
Now we can move towards a first-order representation of continuations, which can be further converted into term constructions. We defunctionalize continuations by first collecting all continuations in eval, then introducing corresponding constructors (the syntax), and finally generating an apply function (the semantics). The collection function accumulates continuation names and their definitions. At the same time it removes the definitions. Its recursive results are written

$$(K_{ce},\ ce') = \mathit{collect}_{CE}\ ce,$$

that is, the collected continuations paired with the expression from which their definitions have been removed.
We reuse continuation names for constructors. The apply function is generated by simply generating a case analysis on the constructors and reusing the argument names from the continuation function arguments. In addition to the defunctionalized continuations, the generated apply function will take the same arguments as eval. Because of the absence of mutual recursion in our meta-language, apply takes eval as an argument. The generated case analysis has the branches

$$\{\, k_1(p_{1,1}, \ldots, p_{1,i}) \to e_1;\ \ldots;\ k_n(p_{n,1}, \ldots, p_{n,j}) \to e_n \,\}$$

Now we need a way to replace calls to continuations with corresponding calls to apply. For $\vec{ae}_\rho$ and $k_{top}$ we use the arguments passed to eval or apply (depending on where we are replacing).

$$\mathit{replace}_{CE}\ \big(k\ \vec{ae}_k\ \langle ae, \vec{ae}_\sigma \rangle\big)\ (\vec{x}_\rho, k_{top}) \;=\; \mathit{apply}\ \mathit{eval}\ \langle k(\vec{ae}_k, ae), \vec{ae}_\sigma \rangle\ \vec{x}_\rho\ k_{top}$$

Finally, the complete defunctionalization is defined in terms of the above three functions.

Remove Self-recursive Tail-Calls
This is the transformation which converts a recursive evaluator into a stepping function. The transformation itself is very simple: in congruence cases, we replace each self-recursive call to apply on a continuation constructor with the constructor itself, so that the reified continuation is returned rather than applied. Note that we still leave those invocations of apply that serve to switch control through the stages of evaluation. Unless a continuation constructor will become a part of the output language, its application will be inlined in the final phase of our transformation.

Convert Continuations to Terms
After defunctionalization, we effectively have two sorts of terms: those constructed using the original constructors and those constructed using continuation constructors. Terms in these two sorts are given their semantics by the eval and apply functions, respectively. To get only one evaluator function at the end of our transformation process, we will join these two sorts, adding extra continuation constructors as new term constructors. We could simply merge apply into eval; however, this would give us many overlapping constructors. For example, in Section 2, we established that kapp1(e2, e1) ≈ app(e1, e2) and kapp2(v1, e2) ≈ app(val(v1), e2). The inference of equivalent term constructors is guided by the following simple principle. For each continuation term $c_k(ae_1, \ldots, ae_n)$ we are looking for a term $c'(ae'_1, \ldots, ae'_m)$ such that, for all $\vec{ae}_\sigma$, $\vec{ae}_\rho$ and $ae_k$,

$$\mathit{apply}\ \mathit{eval}\ \langle c_k(ae_1, \ldots, ae_n), \vec{ae}_\sigma \rangle\ \vec{ae}_\rho\ ae_k \;=\; \mathit{eval}\ \langle c'(ae'_1, \ldots, ae'_m), \vec{ae}_\sigma \rangle\ \vec{ae}_\rho\ ae_k$$

In our current implementation, we use a conservative approach where, starting from the cases in eval, we search for continuations reachable along a control flow path. Variables appearing in the original term are instantiated along the way. Moreover, we collect variables dependent on configuration entities (state). If control flow is split based on information derived from the state, we automatically include any continuation constructors reachable from that point as new constructors in the resulting language and interpreter. This, together with how information flows from the top-level term to subterms in congruence cases, preserves the coupling between state and corresponding subterms between steps.
If, starting from an input term $c(\vec{x})$, an invocation of apply on a continuation term $c_k(\vec{ae}_k)$ is reached, and if, after instantiating the variables in the input term to obtain $c(\vec{ae})$, the sets of free variables of the two terms are equal, then we can introduce a translation from $c_k(\vec{ae}_k)$ into $c(\vec{ae})$. If such a direct path is not found, $c_k$ will become a new term constructor in the language and a case in eval is introduced such that the above equation is satisfied.
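For the running example, the resulting translation can be sketched as a small function. This is only an illustration in Python over an assumed tagged-tuple encoding; `kont_to_term` is a hypothetical name, and the sketch hard-codes the two correspondences stated above rather than inferring them.

```python
def kont_to_term(kont):
    # Translate a reified continuation into an existing term where the
    # correspondence is known; otherwise keep it as a new constructor.
    tag = kont[0]
    if tag == "kapp1":
        _, e2, e1 = kont                 # kapp1(e2, e1) ~ app(e1, e2)
        return ("app", e1, e2)
    if tag == "kapp2":
        _, v1, e2 = kont                 # kapp2(v1, e2) ~ app(val(v1), e2)
        return ("app", ("val", v1), e2)
    return kont                          # e.g. kclo1: no translation found

f = ("var", "f")
a = ("var", "a")
translated_1 = kont_to_term(("kapp1", a, f))
translated_2 = kont_to_term(("kapp2", "some-closure", a))
kept = kont_to_term(("kclo1", "some-env", f))
```

The placeholders `"some-closure"` and `"some-env"` stand in for a closure value and an environment, which the translation passes through without inspecting.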

Inlining, Simplification and Conversion to Direct Style
To finalize the generation of a small-step interpreter, we inline all invocations of apply and simplify the final program. After this, the interpreter will consist of only the eval function, still in continuation-passing style. To convert the interpreter to direct style, we simply substitute (λx.x) for eval's continuation variable and reduce the new redexes. Then we remove the continuation argument, performing rewrites following the scheme: eval ae (λbn. e) ⇒ let bn = eval ae in e. Finally, we remove the reflexive case on values (i.e., val(v) → val(v)). At this point we have a small-step interpreter in direct style.

Removing Vacuous Continuations
After performing the above transformation steps, we may end up with some redundant term constructors, which we call "empty" or vacuous. These are constructors which have only one argument and whose semantics is equivalent to the argument itself, save for an extra step which returns the computed value. In other words, they are unary constructs which have only two rules in the resulting small-step semantics: a congruence rule that steps the argument, and a rule that steps the construct applied to a value to that value.
Such a construct will result from a continuation which, even after generalization and argument lifting, merely evaluates its sole argument and returns the corresponding value. These continuations can be easily identified and removed once argument lifting is performed, or at any point in the transformation pipeline, up until apply is absorbed into eval.

Detour: Generating Pretty-Big-Step Semantics
It is interesting to see what kind of semantics we get by rearranging or removing some steps of the above process. If, after CPS conversion, we do not generalize the continuations, but instead just lift their arguments and defunctionalize them, we obtain a pretty-big-step [6] interpreter. The distinguishing feature of pretty-big-step semantics is that constructs which would normally have rules with multiple premises are factorized into intermediate constructs. As observed by Charguéraud, each intermediate construct corresponds to an intermediate state of the interpreter, which is why, in turn, they naturally correspond to continuations. The pretty-big-step rules generated from the big-step semantics in Fig. 2 (Section 2) use the intermediate constructs kapp1 and kapp2, the latter applied to a closure clo(x, e, ρ′) and the operand value.
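A pretty-big-step evaluator for the running example can be sketched as follows. The sketch is in Python under the same assumed tagged-tuple encoding; `eval_pbs` is a hypothetical name, and the shape of the intermediate constructs kapp1(e2, v1) and kapp2(v1, v2) is an assumption based on the lifted arguments described in Section 2. Every case performs at most one evaluation of a subterm before handing over to an intermediate construct.

```python
def empty_env(x):
    raise NameError(f"unbound variable: {x}")

def update(x, v, env):
    return lambda y: v if y == x else env(y)

def eval_pbs(term, env):
    tag = term[0]
    if tag == "var":
        return env(term[1])
    if tag == "lam":
        return ("clo", term[1], term[2], env)
    if tag == "app":
        _, e1, e2 = term
        v1 = eval_pbs(e1, env)                      # single evaluation premise
        return eval_pbs(("kapp1", e2, v1), env)     # continue in kapp1
    if tag == "kapp1":
        _, e2, v1 = term
        v2 = eval_pbs(e2, env)                      # single evaluation premise
        return eval_pbs(("kapp2", v1, v2), env)     # continue in kapp2
    if tag == "kapp2":
        _, v1, v2 = term
        _, x, body, clo_env = v1                    # v1 is expected to be a closure
        return eval_pbs(body, update(x, v2, clo_env))
    raise ValueError(f"unknown term: {term!r}")

identity = ("lam", "x", ("var", "x"))
const_fn = ("lam", "y", ("lam", "z", ("var", "y")))
result = eval_pbs(("app", identity, const_fn), empty_env)
```

Compared with the big-step evaluator, the three recursive calls of the app case are spread over three cases, one per intermediate state.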

Pretty-Printing
For the purpose of presenting and studying the original and transformed semantics, we add a final pretty-printing phase. This amounts to generating inference rules corresponding to the control flow in the interpreter. This pretty-printing stage can be applied to both the big-step and small-step interpreters and was used to generate many of the rules in this paper, as well as for generating the appendix of the full version of this paper [1].

Correctness
A correctness proof for the full pipeline is not part of our current work. However, several of these steps (partial CPS conversion, partial argument lifting, defunctionalization, conversion to direct style) are instances of well-established techniques. In other cases, such as generalization of continuations (Section 4.2) and removal of self-recursive tail-calls (Section 4.6), we have informal proofs using equational reasoning [1]. The proof for tail-call removal is currently restricted to compositional interpreters.

Evaluation
We have evaluated our approach to deriving small-step interpreters on a range of example languages. Table 2 presents an overview of example big-step specifications and their properties, together with their derived small-step counterparts. A full listing of the input and output specifications for these case studies appears in the appendix to the full version of the paper, which is available online [1]. For our case studies, we have used call-by-value and call-by-name λ-calculi, and a simple imperative language as base languages and extended them with some common features. Overall, the small-step specifications (as well as the corresponding interpreters) resulting from our transformation are very similar to ones we could find in the literature. The differences are either well justified (for example, by different handling of value terms) or they are due to new term constructors which could potentially be eliminated by a more powerful translation.
We evaluated the correctness of our transformation experimentally, by comparing runs of the original big-step and the transformed small-step interpreters, as well as by inspecting the interpreters themselves. In a few cases, we proved the transformation correct by transcribing the input and output interpreters in Coq (as an evaluation relation coupled with a proof of determinism) and proving them equivalent. From the examples in Table 2, we have done so for "Call-by-value", "Exceptions as state", and a simplified version of "CBV, exceptions as state".
We make a few observations about the resulting semantics here.
New Auxiliary Constructs. In languages that use an environment to look up values bound to variables, new constructs are introduced to keep the updated environment as context. These constructs are simple: they have two argumentsone for the environment (context) and one for the term to be evaluated in that environment. A congruence rule will ensure steps of the term argument in the given context and another rule will return the result. The construct kclo1 from the λ-calculus based examples is a typical example.
As observed in Section 2, if the environment ρ'' is a result of updating an environment ρ' with a binding of x to v, then the app rule

ρ'' = update x v ρ'
------------------------------------------
ρ ⊢ app(clo(ρ', x, e), v) → kclo1(ρ'', e)

and the above two rules can be replaced with the following rules for app:

Loops are another common source of a recurring pattern of extra auxiliary constructs. For example, the "While" language listed in Table 2 contains a while-loop with the following big-step rules:

The automatic transformation of these rules introduces two extra constructs, kwhile1 and ktrue1. The former ensures the full evaluation of the condition expression, keeping a copy of it together with the while's body. The latter ensures the full evaluation of the while's body, keeping a copy of the body together with the condition expression.
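To make the roles of kwhile1 and ktrue1 concrete, the following sketch gives one possible small-step interpreter for a tiny While-like language. The argument order of kwhile1 and ktrue1, the expression forms "pos" and "dec", and all function names are assumptions made for illustration; the generated rules for the "While" case study may differ in detail.

```python
# Expressions: ("num", n) | ("bool", b) | ("var", x) | ("pos", e) | ("dec", e)
# Commands:    ("skip",) | ("assign", x, e) | ("while", e, c)
#              | ("kwhile1", e, c, e_cur) | ("ktrue1", e, c, c_cur)

def is_val(e):
    return e[0] in ("num", "bool")

def step_expr(e, s):
    """One step of a non-value expression e in state s."""
    tag = e[0]
    if tag == "var":
        return ("num", s[e[1]])
    arg = e[1]
    if not is_val(arg):
        return (tag, step_expr(arg, s))       # congruence on the operand
    if tag == "pos":
        return ("bool", arg[1] > 0)
    if tag == "dec":
        return ("num", arg[1] - 1)
    raise ValueError(e)

def step_cmd(c, s):
    """One step of a command; returns the next command and state."""
    tag = c[0]
    if tag == "assign":
        _, x, e = c
        if is_val(e):
            return ("skip",), {**s, x: e[1]}
        return ("assign", x, step_expr(e, s)), s
    if tag == "while":
        _, e, body = c
        return ("kwhile1", e, body, e), s     # start evaluating a copy of the condition
    if tag == "kwhile1":
        _, e, body, cur = c
        if not is_val(cur):
            return ("kwhile1", e, body, step_expr(cur, s)), s
        if cur[1]:
            return ("ktrue1", e, body, body), s   # condition true: run a copy of the body
        return ("skip",), s                       # condition false: the loop is done
    if tag == "ktrue1":
        _, e, body, cur = c
        if cur == ("skip",):
            return ("while", e, body), s      # body finished: go around again
        cur2, s2 = step_cmd(cur, s)
        return ("ktrue1", e, body, cur2), s2
    raise ValueError(c)

def run(c, s):
    """Iterate step_cmd until the command is skip; return the final state."""
    while c != ("skip",):
        c, s = step_cmd(c, s)
    return s
```

In this sketch, kwhile1 carries the original condition and body next to the partially evaluated condition, and ktrue1 carries them next to the partially evaluated body, which is why both constructs need three arguments here.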
We observe that in a language with a conditional and a sequencing construct we can find terms corresponding to kwhile1 and ktrue1. Writing if for the conditional, a kwhile1 term corresponds to if(e, seq(c, while(e_b, c)), skip), where e is the partially evaluated condition, and a ktrue1 term corresponds to seq(c', while(e_b, c)), where c' is the partially evaluated body. The small-step semantics of while could then be simplified to a single rule:

⟨while(e_b, c), σ⟩ → ⟨if(e_b, seq(c, while(e_b, c)), skip), σ⟩

Our current, straightforward way of deriving term-continuation equivalents is not capable of finding these equivalences. In future work, we want to explore external tools, such as SMT solvers, to facilitate searching for translations from continuations to terms. This search could be bounded by a maximum term depth.
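The single-rule treatment of while mentioned above amounts to unfolding the loop into a conditional and a sequence. A small sketch, assuming tuple-encoded terms and the constructor names "if", "seq" and "skip" (the name step_while is hypothetical):

```python
def step_while(c, s):
    """Unfold while(e_b, body) into if(e_b, seq(body, while(e_b, body)), skip).

    The state s is returned unchanged; the condition is evaluated later by
    the rules for the conditional.
    """
    if c[0] != "while":
        raise ValueError("expected a while term")
    _, cond, body = c
    return ("if", cond, ("seq", body, c), ("skip",)), s
```

With this rule, no loop-specific auxiliary constructs are needed, because the existing rules for the conditional and for sequencing take over after the unfolding.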
Exceptions as Values. We tested our transformations with two ways of representing exceptions in big-step semantics currently supported by our input language: as values and as state. Representing exceptions as values appears to be more common and is used, for example, in the big-step specification of Standard ML [24], or in [6] in connection with pretty-big-step semantics. Given a big-step specification (or interpreter) in this style, the generated small-step semantics handles exceptions correctly (based on our experiments). However, since exceptions are just values, propagation to top-level is spread out across multiple steps, depending on the depth of the term which raised the exception. The following example illustrates this behavior.

add(1, add(2, add(raise(3), raise(4))))
→ add(1, add(2, add(exc(3), raise(4))))
→ add(1, add(2, exc(3)))
→ add(1, exc(3))
→ exc(3)

Since we expect the input semantics to be deterministic and the propagation of exceptions in the resulting small-step semantics follows the original big-step semantics, this "slow" propagation is not a problem, even if it does not take advantage of "fast" propagation via labels or state. A possible solution we are considering for future work is to let the user flag values in the big-step semantics and translate such values as labels on arrows or a state change to allow propagating them in a single step.
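The slow propagation in the example can be reproduced with a short sketch. The tuple encoding and the functions is_result, step and trace are illustrative assumptions; only the term constructors add, raise and exc are taken from the example.

```python
# Terms: ("num", n) | ("exc", n) | ("raise", t) | ("add", t1, t2)

def is_result(t):
    """Numbers and exception values are both final results."""
    return t[0] in ("num", "exc")

def step(t):
    tag = t[0]
    if tag == "raise":
        arg = t[1]
        if arg[0] == "num":
            return ("exc", arg[1])        # raising turns a number into an exception value
        if arg[0] == "exc":
            return arg                    # an exception in the argument propagates
        return ("raise", step(arg))
    if tag == "add":
        _, left, right = t
        if not is_result(left):
            return ("add", step(left), right)
        if left[0] == "exc":
            return left                   # propagate one level; right is never evaluated
        if not is_result(right):
            return ("add", left, step(right))
        if right[0] == "exc":
            return right                  # propagate one level
        return ("num", left[1] + right[1])
    raise ValueError(t)

def trace(t):
    """Return the list of terms visited until a result is reached."""
    seen = [t]
    while not is_result(t):
        t = step(t)
        seen.append(t)
    return seen
```

Applied to add(1, add(2, add(raise(3), raise(4)))), trace yields the five terms of the reduction sequence shown above: the exception value climbs one add per step.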
Exceptions as State. Another approach to specifying exceptions is to use a flag in the configuration. Rules may be specified so that they only apply if the incoming state has no exception indicated. As with the exceptions-as-values approach, propagation rules have to be written to terminate a computation early if a computation of a subterm indicates an exception. Observe the exception propagation rule for add and the exception handling rule for try. Using state to propagate exceptions is mentioned in connection with small-step SOS in [4]. While this approach has the potential advantage of manifesting the currently raised exception immediately at the top-level, it also poses a problem of locality. If an exception is reinserted into the configuration, it might become decoupled from the original site. This can result, for example, in the wrong handler catching the exception in a following step. Our transformation deals with this style of exceptions naturally by preserving more continuations in the final interpreter. After being raised, an exception is inserted into the state and propagated to top-level by congruence rules. However, it will only be caught after the corresponding subterm has been evaluated, or rather, after a value has been propagated upwards to signal a completed computation. This behavior corresponds to exception handling in big-step rules, only it is spread out over multiple steps. Continuations are kept in the final language to correspond to stages of computation and thus to preserve the locality of a raised exception. A handler will only handle an exception once the raising subterm has become a value. Hence, the exception will be intercepted by the innermost handler, even if the exception is visible at the top-level of a step.
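The locality argument can be illustrated with a simplified sketch in which the configuration is a pair of a term and an exception flag. The continuation constructs kadd1, kadd2 and ktry1, the dummy value 0 left behind by raise, and the function names are all assumptions made for this illustration; they are not the constructs generated for the case studies.

```python
# Terms: ("num", n) | ("raise", n) | ("add", l, r) | ("try", body, handler)
#        | ("kadd1", l_cur, r) | ("kadd2", v, r_cur) | ("ktry1", body_cur, handler)
# The flag exc is None (no exception) or the raised number.

def is_value(t):
    return t[0] == "num"

def step(t, exc):
    tag = t[0]
    if tag == "raise":
        return ("num", 0), t[1]               # dummy value; the flag records the exception
    if tag == "add":
        return ("kadd1", t[1], t[2]), exc     # stage 1: evaluate the left operand
    if tag == "kadd1":
        _, left, right = t
        if not is_value(left):
            left2, exc2 = step(left, exc)
            return ("kadd1", left2, right), exc2
        if exc is not None:
            return left, exc                  # left operand raised: stop early
        return ("kadd2", left, right), exc    # stage 2: evaluate the right operand
    if tag == "kadd2":
        _, left, right = t
        if not is_value(right):
            right2, exc2 = step(right, exc)
            return ("kadd2", left, right2), exc2
        if exc is not None:
            return right, exc                 # right operand raised: stop early
        return ("num", left[1] + right[1]), exc
    if tag == "try":
        return ("ktry1", t[1], t[2]), exc
    if tag == "ktry1":
        _, body, handler = t
        if not is_value(body):
            body2, exc2 = step(body, exc)
            return ("ktry1", body2, handler), exc2
        if exc is not None:
            return handler, None              # handle only once the body is a value
        return body, exc
    raise ValueError(t)

def run(t):
    """Iterate step from a configuration with no exception raised."""
    exc = None
    while not is_value(t):
        t, exc = step(t, exc)
    return t, exc
```

In try(add(1, try(raise(3), 10)), 20), the flag is set while both handlers are on the path to the top of the term, yet only the inner ktry1 has a body that is already a value, so the result is 11 with the flag cleared. Checking the flag directly in a plain add(v, t) rule, without the separate stages kadd1 and kadd2, would let the outer add abort first and hand the exception to the outer handler.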
Based on our experiments, the exception-as-state handling in the generated small-step interpreters is a truthful unfolding of the big-step evaluation process. This is further supported by our ad-hoc proofs of equivalence between input and output interpreters. However, the generated semantics suffers from a blowup in the number of rules and moves away from the usual small-step propagation and exception handling in congruence rules. We see this as a shortcoming of the transformation. To overcome this, we briefly experimented with a case-floating stage, which would result in catching exceptions in the congruence cases of continuations. Using such a transformation, the resulting interpreter would more closely mirror the standard small-step treatment of exceptions as signals. However, the conditions under which this transformation should be triggered need to be considered carefully, and we leave this for future work.
Limited Non-determinism. In the present work, our aim was to only consider deterministic semantics implemented as an interpreter in a functional programming language. However, since cases of the interpreter are considered independently in the transformation, some forms of non-determinism in the input semantics get translated correctly. For example, the following internal choice construct (cf. CSP's ⊓ operator [5,17]) gets transformed correctly. The straightforward big-step rules are transformed into small-step rules as expected. Of course, one has to keep in mind that these rules are interpreted as ordered, that is, the first rule in both styles will always apply.
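The effect of the ordered interpretation can be seen in a small sketch. The constructor name choice and the two rules per style are assumptions about how an internal choice would typically be specified; the function names are hypothetical.

```python
# Terms: ("num", n) | ("choice", t1, t2)

def big_step(t):
    if t[0] == "num":
        return t[1]
    if t[0] == "choice":
        # Rule 1: choice(t1, t2) evaluates to v if t1 evaluates to v.
        # Rule 2 (same shape with t2) is never reached because rule 1 always applies.
        return big_step(t[1])
    raise ValueError(t)

def step(t):
    if t[0] == "choice":
        # Rule 1: choice(t1, t2) steps to t1.
        # Rule 2: choice(t1, t2) steps to t2, shadowed by rule 1.
        return t[1]
    raise ValueError(t)

def run_small_step(t):
    while t[0] != "num":
        t = step(t)
    return t[1]
```

Both interpreters therefore always select the left alternative, so they agree with each other even though neither explores the second rule.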

Related Work
In their short paper [18], the authors propose a direct syntactic way of deriving small-step rules from big-step ones. Unlike our approach, which is based on manipulating control flow in an interpreter, their transformation applies to a set of inference rules. While axioms are copied over directly, for conditional rules a stack is added to the configuration to keep track of evaluation. For each conditional big-step rule, an auxiliary construct and four small-step rules are generated. Results of "premise computations" are accumulated and side-conditions are only discharged at the end of such a computation sequence. For this reason, we can view the resulting semantics more as a "leap" semantics, which makes it less suitable for a semantics-based interpreter or debugger. A further disadvantage is that the resulting semantics is far removed from a typical small-step specification, with a higher potential for blow-up, as four rules are introduced for each conditional rule. On the other hand, the delayed unification of meta-variables and discharging of side-conditions potentially makes the transformation applicable to a wider array of languages, including those where control flow is not as explicit.
In [2], the author explores an approach to constructing abstract machines from big-step (natural) specifications. It applies to a class of big-step specifications called L-attributed big-step semantics, which allows for sufficiently interesting languages. The extracted abstract machines use a stack of evaluation contexts to keep track of the stages of computations. In contrast, our transformed interpreters rebuild the context via congruence rules in each step. While this is less efficient as a computation strategy, the intermediate results of the computation are visible in the context of the original program, in line with usual SOS specifications.
A significant body of work has been developed on transformations that take a form of small-step semantics (usually an interpreter) and produce a big-step-style interpreter. The relation between semantic specifications, interpreters and abstract machines has been thoroughly investigated, mainly in the context of reduction semantics [10,11,12,13,26]. In particular, our work was inspired by and is based on Danvy's work on refocusing in reduction semantics [13] and on the use of CPS conversion and defunctionalization to convert between representations of control in interpreters [11].
A more direct approach to deriving big-step semantics from small-step is taken by the authors of [4], where a small-step Modular SOS specification is transformed into a pretty-big-step one. This is done by introducing reflexivity and transitivity rules into a specification, along with a "refocus" rule which effectively compresses a transition sequence into a single step. The original small-step rules are then specialized with respect to these new rules, yielding refocused rules in the style of pretty-big-step semantics [6]. A related approach is by Ciobâcă [7], where big-step rules are generated for a small-step semantics. The big-step rules are, again, close to a pretty-big-step style.