FunKons: Component-Based Semantics in K
Modularity has been recognised as a problematic issue of programming language semantics, and various semantic frameworks have been designed with it in mind. Reusability is another desirable feature which, although not the same as modularity, can be enabled by it. The K Framework, based on Rewriting Logic, has good modularity support, but reuse of specifications is not as well developed.
The PLanCompS project is developing a framework providing an open-ended collection of reusable components for semantic specification. Each component specifies a single fundamental programming construct, or ‘funcon’. The semantics of concrete programming language constructs is given by translating them to combinations of funcons. In this paper, we show how this component-based approach can be seamlessly integrated with the K Framework. We give a component-based definition of CinK (a small subset of C++), using K to define its translation to funcons as well as the (dynamic) semantics of the funcons themselves.
1 Introduction
Even very different programming languages often share similar constructs. Consider OCaml’s conditional ‘if \(E_1\) then \(E_2\) else \(E_3\)’ and the conditional operator ‘\(E_1\) ? \(E_2\) : \(E_3\)’ in C. These constructs have different concrete syntax but similar semantics, with some variation in details. We would like to exploit this similarity when defining formal semantics for both languages by reusing commonalities between the OCaml and C specifications. With traditional approaches to semantics, reuse through ‘copy-paste-and-edit’ is usually the only option available. By default, this is also the case with the K Framework [9, 13]. This style of specification reuse is not systematic and is prone to error.
The semantic framework currently being developed by the PLanCompS project provides fundamental constructs (funcons) that address the issues of reusability in a systematic manner. Funcons are small semantic entities which express essential concepts of programming languages. These formally specified components can be composed to capture the semantics of concrete programming language constructs. A specification of Caml Light has been developed as an initial case study, and a case study on C# is in progress.
PLanCompS uses MSOS, a modular variant of structural operational semantics, to formally define individual funcons. However, the funcon approach can be seamlessly integrated with other sufficiently modular specification frameworks. We have tested the use of funcons with the K Framework by giving a specification of CinK [8, 9], a pedagogical subset of C++. We have defined both the translation of CinK to funcons and the semantics of the funcons using K’s rewrite rules. The complete prototype specification is available online, together with the CinK test programs which we have used to test our specification. Interested readers may run these programs themselves using the K tool.
In this paper, we present our specification of the CinK translation (Sect. 3) and illustrate the definition of the semantics of funcons involved in it (Sect. 4). Section 5 offers an overview of related work and alternative approaches. We conclude and suggest directions of future work in Sect. 6.
2 Fundamental Constructs
As mentioned in the Introduction, the PLanCompS project is developing an open-ended collection of fundamental programming constructs, or ‘funcons’. Many funcons correspond closely to simplified programming language constructs. However, each funcon has fixed syntax and semantics. For example, the assignment funcon has the effect of evaluating its first argument to a variable and its second argument to a value (in any order), then assigning the value to the variable; it is well-typed only if the first argument has the type of variables storing values of some type and the second argument has that value type. In contrast, the language construct written ‘\(E_1\) = \(E_2\)’ may be interpreted as an assignment or as an equality test (and its well-typedness changes accordingly) depending on the language.
The sort of commands is for funcons that are executed only for their effects; on normal termination, a command computes a fixed value.
The sort of declarations is for funcons that compute values representing sets of bindings between identifiers and values.
One of the aims of the PLanCompS project is to establish an online repository of funcons (and data types) for anybody to use ‘off-the-shelf’ as components of language specifications. The project is currently testing the reusability of existing funcons and developing new ones in connection with some major case studies (including Caml Light, C#, and Java). Because individual funcons are meant to represent fundamental concepts in programming languages, many funcons (expressing, e.g., sequencing, conditionals, variable lookup and dereferencing) have a high potential for reuse. In fact, many funcons used in the Caml Light case study appear in the semantics of CinK presented in the following section.
The nomenclature and notation for the existing funcons are still evolving, and they will be finalised only when the case studies have been completed, in connection with the publication of the repository. Observant readers are likely to notice some (minor) differences between the funcon names used in this paper and in previous papers.
Regardless of the details of funcon notation, funcons can be algebraically composed to form funcon terms, according to their argument and result sorts (strictly lifted to corresponding computation sorts). Well-formedness of funcon terms is context-free: the application of a funcon is a well-formed funcon term whenever its arguments are well-formed funcon terms of the required sorts. In contrast, well-typedness of funcon terms is generally context-sensitive. For example, a funcon term that assigns an integer to the variable bound to an identifier is well-typed only in the scope of a declaration that binds that identifier to an integer variable. Dynamic semantics is defined for all well-formed terms; execution of ill-typed terms may fail.
The composability of funcons does not depend on features such as whether they might have side effects, terminate abruptly, diverge, spawn processes, interact, etc. This is crucial for the reusability of the funcons. The semantics of each funcon has to be specified without regard to the context in which it might be used, which requires a highly modular specification framework. Funcon specifications have previously been given in MSOS, Rewriting Logic, ASF+SDF, and action notation. Here, we explore specifying funcons in K, following Roşu.
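The distinction between context-free well-formedness and context-sensitive well-typedness can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the funcon names `assign`, `bound_value` and `int_lit`, the sort names, and the tuple encoding of terms are hypothetical and are not taken from the PLanCompS repository or from our K sources.

```python
# Funcon terms are modelled as nested tuples: (name, arg1, arg2, ...).
# All funcon and sort names below are hypothetical stand-ins.
SIGNATURES = {
    "assign": (("expr", "expr"), "expr"),
    "bound_value": (("id",), "expr"),
    "int_lit": (("int",), "expr"),
}

def sort_of(term):
    """Well-formedness check: needs only the term itself, no context."""
    if isinstance(term, str):
        return "id"
    if isinstance(term, int):
        return "int"
    name, *args = term
    if name not in SIGNATURES:
        raise ValueError(f"unknown funcon: {name}")
    arg_sorts, result_sort = SIGNATURES[name]
    if tuple(sort_of(a) for a in args) != arg_sorts:
        raise ValueError(f"ill-formed application of {name}")
    return result_sort

def type_of(term, env):
    """Well-typedness check: depends on env, a map from identifiers to types."""
    name, *args = term
    if name == "int_lit":
        return "integers"
    if name == "bound_value":
        if args[0] not in env:
            raise TypeError(f"unbound identifier: {args[0]}")
        return env[args[0]]
    if name == "assign":
        target = type_of(args[0], env)
        value = type_of(args[1], env)
        if target != ("variables", value):
            raise TypeError("assignment target and value types do not match")
        return target
    raise ValueError(f"unknown funcon: {name}")

term = ("assign", ("bound_value", "x"), ("int_lit", 42))
```

Here `sort_of(term)` succeeds without any environment, whereas `type_of(term, env)` succeeds only when `env` binds `x` to a variable of integers.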
A component-based semantics of a programming language is specified by a context-free grammar for an abstract syntax for the language, together with a family of inductively specified functions translating abstract syntax trees to funcon terms. The static and dynamic semantics of a program is given by that of the resulting funcon term. As mentioned above, funcons have fixed syntax and semantics. Thus, evolution of a language is expressed as changes to translation functions. If the syntax or semantics of the programming language changes, the definition of the translation function has to be updated to reflect this.
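As a rough illustration of this scheme, the following Python sketch translates the conditional constructs of two toy languages (modelled on the OCaml and C conditionals mentioned in the Introduction) to the same conditional funcon and then runs the resulting funcon term. The funcon names (`if_true`, `equal`, `not`, `int_lit`, `bool_lit`), the tuple encoding of syntax trees, and the interpreter are all assumptions made for this example; they are not the definitions used in our specification.

```python
# Abstract syntax trees and funcon terms are both nested tuples.
# All funcon names are hypothetical stand-ins.

def translate_ocaml(ast):
    """Translate a tiny OCaml-like expression to a funcon term."""
    kind = ast[0]
    if kind == "num":
        return ("int_lit", ast[1])
    if kind == "bool":
        return ("bool_lit", ast[1])
    if kind == "if_then_else":  # if E1 then E2 else E3
        return ("if_true",) + tuple(translate_ocaml(e) for e in ast[1:])
    raise ValueError(f"unknown construct: {kind}")

def translate_c(ast):
    """Translate a tiny C-like expression to a funcon term."""
    kind = ast[0]
    if kind == "num":
        return ("int_lit", ast[1])
    if kind == "cond_op":  # E1 ? E2 : E3, where E1 is tested against zero
        e1, e2, e3 = (translate_c(e) for e in ast[1:])
        is_nonzero = ("not", ("equal", e1, ("int_lit", 0)))
        return ("if_true", is_nonzero, e2, e3)
    raise ValueError(f"unknown construct: {kind}")

def run(term):
    """Dynamic semantics of the funcons; fixed, independent of source language."""
    name, *args = term
    if name in ("int_lit", "bool_lit"):
        return args[0]
    if name == "equal":
        return run(args[0]) == run(args[1])
    if name == "not":
        return not run(args[0])
    if name == "if_true":
        condition = run(args[0])
        if not isinstance(condition, bool):
            raise TypeError("if_true expects a boolean condition")
        return run(args[1]) if condition else run(args[2])
    raise ValueError(f"unknown funcon: {name}")
```

The variation between the two languages is confined to the translation functions: the C version inserts a test against zero, while both reuse the same `if_true` funcon unchanged.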
Tool support for translating programs to funcon terms, and for executing the static and dynamic semantics of such terms, has previously been developed in Prolog, Maude and ASF+SDF. We now present our experiment with K, focusing on dynamic semantics.
3 A Funcon Specification of CinK
This section presents an overview of our CinK specification using funcons. We include examples from the K sources of the specification. A selection of definitions of funcons involved in the specification can be found in Sect. 4.
CinK is a pedagogical subset of C++ [8, 9] used for experimentation with the K Framework. The original report [8] presents the language in seven iterations. The first specifies a basic imperative language; subsequent iterations extend it with threads, model-checking, references, pointers, and uni-dimensional and multi-dimensional arrays. Our specification starts with only an expression language which we extend with declarations, statements, functions, threads, references, pointers, and arrays. The extensions follow the order of the CinK iterations; however, we omit support for model-checking.
We invite the reader to compare our specification by translation to funcons with the original K specification of CinK in [8]. Our hope is that our translation functions, together with the suggestive naming of funcons, give a rough understanding of the semantics of language constructs, even before looking at the semantics of funcons themselves.
3.1 Simple Expressions
To give semantics to expressions, we use a dedicated translation function. It produces a funcon term which, when executed, evaluates the argument expression.
3.2 Variables, Blocks and Scope
In relation to variables, CinK (following C++) distinguishes between two general categories of expressions: lvalue- and rvalue-expressions. We express this distinction by having different translation functions for expressions in lvalue and rvalue contexts: in addition to the default translation function for expressions, we define one for lvalue contexts and one for rvalue contexts. The default function produces terms evaluating lvalue and rvalue expressions according to their category. When an expression is expected to evaluate to an lvalue, we use the lvalue translation function. When an rvalue is expected, we use the rvalue translation function, which produces terms evaluating all expressions into rvalues. For lvalue expressions it returns the corresponding stored value, i.e., it serves as an lvalue-to-rvalue conversion.
The addition of variables also affects our translations of simple expressions and we need to update them. For example, numeric operations expect an rvalue and thus the operands are now translated using the rvalue translation function.
Although Caml Light and CinK are quite different languages, all the funcons we have needed so far for CinK are reused from the Caml Light specification.
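The interplay of the three translation functions can be sketched as follows. This Python fragment is an illustration only: the function names, the funcon names (`bound_value`, `stored_value`, `int_add`, `int_lit`) and the tuple encoding of syntax are assumptions introduced for the example, not the names used in our K sources.

```python
# Expressions and funcon terms are nested tuples; all names are hypothetical.

def is_lvalue(expr):
    """In this toy fragment, only variable references are lvalue expressions."""
    return expr[0] == "var"

def translate(expr):
    """Default translation: evaluates each expression according to its category."""
    kind = expr[0]
    if kind == "var":
        return ("bound_value", expr[1])  # yields the variable itself
    if kind == "num":
        return ("int_lit", expr[1])
    if kind == "add":  # numeric operations expect rvalue operands
        return ("int_add", translate_rval(expr[1]), translate_rval(expr[2]))
    raise ValueError(f"unknown construct: {kind}")

def translate_lval(expr):
    """Translation for contexts where an lvalue is expected."""
    if not is_lvalue(expr):
        raise ValueError("lvalue expected")
    return translate(expr)

def translate_rval(expr):
    """Translation for rvalue contexts: adds an lvalue-to-rvalue conversion."""
    term = translate(expr)
    return ("stored_value", term) if is_lvalue(expr) else term
```

For the expression `x + 1`, the operand `x` is wrapped in the conversion funcon while the literal is left as it is.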
3.3 Assignment and Control Statements
The assignment funcon is strict in both arguments but not sequentially, so the arguments are evaluated in an unspecified order. It assigns the value given as its second argument to the variable given as its first argument and returns this variable as its result.
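The behaviour described above can be approximated by the following Python sketch. It is an assumption-laden model rather than our K definition: arguments are passed as zero-argument callables, the unspecified evaluation order is simulated by a random shuffle, and the names `Variable` and `assign` are hypothetical.

```python
import random

class Variable:
    """A mutable storage cell standing in for an allocated variable."""
    def __init__(self, value=None):
        self.value = value

def assign(eval_target, eval_value, rng=random):
    """Evaluate both arguments in an arbitrary order, then perform the store.

    eval_target and eval_value are zero-argument callables, so that this
    function controls when (and in which order) each argument is evaluated.
    """
    pending = [("target", eval_target), ("value", eval_value)]
    rng.shuffle(pending)  # models the unspecified evaluation order
    results = {label: thunk() for label, thunk in pending}
    results["target"].value = results["value"]
    return results["target"]  # the variable is returned as the result
```

A caller can observe that both arguments are always evaluated exactly once, but cannot rely on which one is evaluated first.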
3.4 Function Definition and Calling
We represent functions as abstraction values which wrap any computation as a value. An abstraction can be passed as a parameter, bound to an identifier, or stored like any other value. To turn a funcon term into an abstraction, we use a dedicated value constructor. An application funcon applies an abstraction to a value, and the abstraction may refer to the passed value using a corresponding funcon. Multiple parameters can be passed as a tuple constructed using tuple value constructors.
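A minimal interpreter sketch may help to make this concrete. The Python code below is illustrative only: the funcon names (`abstraction`, `apply`, `given`, `tuple`, `project`, `int_add`, `lit`) and the representation of abstraction values are assumptions chosen for this example.

```python
# Funcon terms are nested tuples; all names are hypothetical stand-ins.

def run(term, given=None):
    """Evaluate a funcon term; 'given' is the value passed to the enclosing abstraction."""
    name, *args = term
    if name == "lit":
        return args[0]
    if name == "given":
        return given
    if name == "tuple":
        return tuple(run(a, given) for a in args)
    if name == "project":  # ("project", index, tuple_term)
        return run(args[1], given)[args[0]]
    if name == "int_add":
        return run(args[0], given) + run(args[1], given)
    if name == "abstraction":
        # The body is wrapped as a value and is NOT evaluated here.
        return ("closure", args[0])
    if name == "apply":
        closure = run(args[0], given)
        argument = run(args[1], given)
        if closure[0] != "closure":
            raise TypeError("apply expects an abstraction")
        return run(closure[1], argument)  # body sees the argument via 'given'
    raise ValueError(f"unknown funcon: {name}")

# A two-parameter function: its parameters arrive as a tuple.
add = ("abstraction",
       ("int_add", ("project", 0, ("given",)), ("project", 1, ("given",))))
call = ("apply", add, ("tuple", ("lit", 2), ("lit", 3)))
```

Evaluating `run(call)` applies the abstraction to the pair of arguments, and the body retrieves each component from the given tuple.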
Because function identifiers are already bound when the full function definition is elaborated, the full definition only assigns the abstraction to the pre-allocated variable.
The thread-creation funcon takes an abstraction and creates a new thread in which that abstraction will be applied. In our case the abstraction contains a function call corresponding to the parameters given to the thread constructor.
To illustrate, consider a declaration of a pointer to a pointer to an integer variable. The type of this variable is first expressed in our auxiliary syntax and then analysed into the corresponding funcon type.
Pointer variables are allocated in the same manner as other variables: we simply pass the type of the pointer variable as the argument to the allocation funcon.
If the pointer is null, dereferencing it or assigning to it will result in a stuck computation.
In CinK, multi-dimensional arrays are specified as vectors of vectors. As an illustration of translating array types, consider a declaration statement for a 2-by-3 array of integers. The type of the declared array is first expressed in our auxiliary syntax. The translated type is vectors(2, vectors(3, variables(integers))). The allocation construct properly allocates variables for such multi-dimensional vectors and returns a compound value of the appropriate type.
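The shape of the translated type and the effect of allocation can be sketched in Python as follows. The type constructors mirror the translated type shown above, but the helper names `array_type` and `allocate`, the tuple encoding, and the use of integer store locations are assumptions made for this illustration.

```python
def array_type(element_type, dimensions):
    """Build vectors(d1, vectors(d2, ... variables(element_type)...)) as nested tuples."""
    result = ("variables", element_type)
    for size in reversed(dimensions):
        result = ("vectors", size, result)
    return result

def allocate(type_, store):
    """Allocate fresh variables for type_ in store and return the compound value.

    store maps locations (integers, a hypothetical choice) to stored values.
    """
    if type_[0] == "variables":
        location = len(store)   # next unused location
        store[location] = None  # allocated but uninitialised
        return location
    if type_[0] == "vectors":
        _, size, inner = type_
        return [allocate(inner, store) for _ in range(size)]
    raise ValueError(f"cannot allocate type: {type_!r}")

store = {}
matrix_type = array_type("integers", [2, 3])
matrix = allocate(matrix_type, store)
```

The resulting value is a vector of two vectors, each holding three distinct variables, so six locations are allocated in total.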
A Note on Reuse. The complete funcon definition of CinK available online uses 27 funcons. Of these, 19 have been previously used in the specification of Caml Light and only 8 were introduced in the present work, 3 of which are just abbreviations for longer funcon terms. It is thus possible to conclude that the degree of reuse of funcons between the Caml Light and CinK specifications is high, even if the languages are quite different.
It appears that this configuration could be generated from the K rules defining the funcons used in our specification of CinK. It is unclear to us whether inference of K configurations from arbitrary K rules is possible, and whether it would be consistent with the K configuration abstraction algorithm.
3.10 Sequencing of Side Effects
Following the C++ standard [7], CinK decouples side effects of some constructs to allow delaying memory writes until after an expression value has been returned. This gives compilers more freedom for optimisation and code generation. The newest C++ standard uses a relation, sequenced before, to define how side effects are to be ordered with respect to each other and to value evaluation. The original CinK specification in K [8] uses auxiliary constructs for side effects and uses a bag to collect side effects. An auxiliary sequence point construct forces finalisation of side effects in the bag.
We have experimented with funcons to express decoupled side effects and have developed a preliminary K specification of the relevant funcons. Our solution is based on a pair of funcons. The first funcon encapsulates an expression, which can potentially request to defer side effects. It also maintains a set of deferred side effects which are computed interleaved with the encapsulated expression. Finally, it ensures that all side effect computations have finished before returning the value of the original expression. The other funcon serves to defer a side effect: it signals to the encapsulating funcon that a computation is to be interleaved with the evaluation of the original expression.
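The division of labour between the two funcons can be approximated by the Python sketch below. It is a deliberate simplification under stated assumptions: deferred effects are run after the expression has been evaluated rather than interleaved with it, and the names `Encapsulation`, `encapsulate`, `defer` and `post_increment` are hypothetical, not those of our preliminary K specification.

```python
class Encapsulation:
    """Holds the side effects deferred during evaluation of one expression."""
    def __init__(self):
        self.pending = []

    def defer(self, effect):
        """Request that effect (a zero-argument callable) be performed later."""
        self.pending.append(effect)

def encapsulate(expression):
    """Evaluate expression, then finish all deferred effects before returning.

    expression is a callable that receives the Encapsulation context.
    Simplification: effects run after evaluation instead of interleaved with it.
    """
    context = Encapsulation()
    value = expression(context)
    for effect in context.pending:
        effect()
    return value

def post_increment(store, name, context):
    """Return the old value of a variable and defer the write of the new one."""
    old = store[name]
    context.defer(lambda: store.__setitem__(name, old + 1))
    return old

store = {"x": 1}
seen = encapsulate(lambda ctx: (post_increment(store, "x", ctx), store["x"]))
```

Inside the encapsulated expression the store still holds the old value of `x`; the write becomes visible only once the encapsulating call has returned.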
4 Funcons in K
We now illustrate our K specification of the syntax and semantics of the funcons and value types used in our component-based analysis of CinK. We specify each funcon and value type in a separate module, to facilitate selective reuse. Since modularity is a significant feature of our specifications, we show some of the specified imports. The complete specifications are available online, together with the K specification of the translation of CinK programs to funcons.
We specify a corresponding funcon for conditional commands separately, since it appears that K modules cannot have parametric sorts (although the rules above could be generalised to arbitrary K arguments).
We could have included another funcon as an operation in the above module, since it is strict in its only expression argument.
4.5 Vector Allocation
5 Related Work
The work in this paper was inspired by a basic specification of the IMP example language in funcons using K by Roşu. IMP contains arithmetic and boolean expressions, variables, if- and while-statements, and blocks. The translation to funcons is specified directly using K rewrite rules without defining sorted translation functions. The example can be found in the stable K distribution.
CinK, the sublanguage of C++ that we use as a case study in this paper, is taken from a technical report by Lucanu and Şerbănuţă [8]. We have limited ourselves to the same subset of C++.
SIMPLE [12] is another K example language which is fairly similar to CinK. The language is presented in two variants: an untyped and a typed one. The definition of typed SIMPLE uses a different syntax and only specifies static semantics. With the component-based approach, we specify a single translation of language constructs to funcons. The MSOS of the funcons defines separate relations for typing and evaluation; in K, it seems we would need to provide a separate static semantics module for each funcon, since the strictness annotations and the computation rules differ.
K specifications scale up to real-world languages, as illustrated by Ellison’s semantics of C [4]. The PLanCompS project is currently carrying out major case studies (C#, Java) to examine how the funcon-based approach scales up to large languages, and to test the reusability of the funcon specifications.
Specification of individual language constructs in separate K modules was proposed by Hills and Roşu and further developed by Hills [5, Chap. 5]. They obtained reusable rules by inferring the transformations needed for the rules to match the overall K configuration. The reusability of their modules was limited by their dependence on language syntax, and by the fact that the semantics of individual language constructs is generally more complicated than that of individual funcons.
6 Conclusion
We have given a component-based specification of CinK, using K to define the translation of CinK to funcons as well as the (dynamic) semantics of the funcons themselves. This experiment confirms the feasibility of integrating component-based semantics with the K Framework.
The K specification of each funcon is an independent module. Funcons are significantly simpler than constructs of languages such as CinK, and it was pleasantly straightforward to specify their K rules. However, we would have preferred the K configurations for combination of funcons to be generated automatically.
Many of the funcons used here for CinK were introduced in the component-based specification of Caml Light, demonstrating their reusability. The names of the funcons are suggestive of their intended interpretation, so the translation specification alone should convey a first impression of the CinK semantics. Readers are invited to browse the complete K specifications of our funcons online, then compare our translation of CinK to funcons with its direct specification in K [8].
In the future, we are aiming to define the static semantics of funcons in K, so our translation would induce a static semantics for CinK.
References
- 1. Chalub, F., Braga, C.: Maude MSOS tool. In: WRLA 2006, ENTCS, vol. 176, pp. 133–146. Elsevier (2007)
- 3. Churchill, M., Mosses, P.D., Torrini, P.: Reusable components of semantic specifications. In: Proceedings of the 13th International Conference on Modularity, MODULARITY ’14, pp. 145–156. ACM, New York (2014)
- 4. Ellison, C., Roşu, G.: An executable formal semantics of C with applications. In: Proceedings of the 39th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL ’12, pp. 533–544. ACM, New York (2012)
- 5. Hills, M.: A Modular Rewriting Approach to Language Design, Evolution and Analysis. Ph.D. thesis, University of Illinois at Urbana-Champaign (2009)
- 7. ISO International Standard ISO/IEC 14882:2011(E) – Programming Language C++ (2011). http://isocpp.org/std/the-standard
- 8. Lucanu, D., Şerbănuţă, T.F.: CinK – an exercise on how to think in K. Technical report TR 12-03 (v2), Faculty of Computer Science, A. I. Cuza University, December 2013. https://fmse.info.uaic.ro/publications/181/
- 12. Roşu, G., Şerbănuţă, T.F.: K overview and SIMPLE case study. In: Proceedings of the Second International Workshop on the K Framework and Its Applications (K 2011), ENTCS, vol. 304, pp. 3–56. Elsevier (2014)
- 13. Şerbănuţă, T.F., Arusoaie, A., Lazar, D., Ellison, C., Lucanu, D., Roşu, G.: The K primer (v3.3). In: Proceedings of the Second International Workshop on the K Framework and Its Applications (K 2011), ENTCS, vol. 304, pp. 57–80. Elsevier (2014)