HOBiT: Programming Lenses Without Using Lens Combinators
Abstract
We propose HOBiT, a higher-order bidirectional programming language, in which users can write bidirectional programs in the familiar style of conventional functional programming, while enjoying the full expressiveness of lenses. A bidirectional transformation, or a lens, is a pair of mappings between source and view data objects, one in each direction. When the view is modified, the source is updated accordingly with respect to some laws, a pattern that is found in databases, model-driven development, compiler construction, and so on. The most common way of programming lenses is with lens combinators, which are lens-to-lens functions that compose simpler lenses to form more complex ones. Lens combinators preserve the bidirectionality of lenses and are expressive; but they compel programmers to a specialised point-free style (i.e., no naming of intermediate computation results), limiting the scalability of bidirectional programming. To address this issue, we propose a new bidirectional programming language HOBiT, in which lenses are represented as standard functions, and combinators are mapped to language constructs with binders. This design transforms bidirectional programming, enabling programmers to write bidirectional programs in a flexible functional style and at the same time access the full expressiveness of lenses. We formally define the syntax, type system, and the semantics of the language, and then show that programs in HOBiT satisfy bidirectionality. Additionally, we demonstrate HOBiT's programmability with examples.
1 Introduction
Transforming data from one format to another is a common task of programming: compilers transform program texts into syntax trees, manipulate the trees and then generate low-level code; database queries transform base relations into views; model transformations generate lower-level implementations from higher-level models; and so on. Very often, such transformations will benefit from being bidirectional, allowing changes to the targets to be mapped back to the sources too. For example, if one can run a compiler front-end (preprocessing, parsing, desugaring, etc.) backwards, then all sorts of program analysis tools will be able to focus on a much smaller core language, without sacrificing usability, as their outputs in terms of the core language will be transformed backwards to the source language. In the same way, such needs arise in databases (the view-update problem [1, 6, 12]) and model-driven engineering (bidirectional model transformation) [28, 33, 35].
The most common way of programming lenses is with lens combinators [3, 7, 8], which are basically a selection of lens-to-lens functions that compose simpler lenses to form more complex ones. This combinator-based approach follows the long history of lightweight language development in functional programming. The distinctive advantage of this approach is that by restricting the lens language to a few selected combinators, well-behavedness can be more easily preserved in programming, and therefore given well-behaved lenses as inputs, the combinators are guaranteed to produce well-behaved lenses. This idea of lens combinators is very influential academically, and various designs and implementations have been proposed [2, 3, 7, 8, 9, 16, 17, 27, 32] over the years.
1.1 The Challenge of Programmability
The complexity of a piece of software can be classified as either intrinsic or accidental. Intrinsic complexity reflects the inherent difficulty of the problem at hand, whereas accidental complexity arises from the particular programming language, design or tools used to implement the solution. This work aims at reducing the accidental complexity of bidirectional programming by contributing to the design of bidirectional languages. In particular, we identify a language restriction (i.e., no naming of intermediate computation results) which complicates lens programming, and propose a new design that removes it.
As a teaser to demonstrate the problem, let us consider the list append function. In standard unidirectional programming, it can be defined simply as \( append \;x \;y = \mathbf {case}~x~\mathbf {of}~\{ [\,]\rightarrow y; a:x' \rightarrow a: append \;x' \;y\}\). Astute readers may have already noticed that \( append \) is defined by structural recursion on x, which can be made explicit by using \( foldr \) as in \( append \;x \;y = foldr \;(:) \;y \;x\).
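For readers who prefer to see the two unidirectional definitions as program text, they can be transcribed into Haskell as follows. The name \( appendFold \) is ours, introduced only to keep the two variants apart.

```haskell
-- Unidirectional append, written twice as in the text.

-- Explicit case analysis on the first list.
append :: [a] -> [a] -> [a]
append x y = case x of
  []     -> y
  a : x' -> a : append x' y

-- The same function as a fold over x. Note that y is a free variable
-- of the fold: it is bound outside the recursion pattern.
appendFold :: [a] -> [a] -> [a]
appendFold x y = foldr (:) y x
```

The free use of \( y \) inside the fold is the feature that the point-free lens style cannot express directly.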
It is beyond the scope of this paper to explain how exactly the definition of \( appendL \) works, as its obscurity is what this work aims to remove. Instead, we informally describe its behaviour and the various components of the code. The above code defines a lens: forwards, it behaves as the standard \( append \), and backwards, it splits the updated view list, and when the length of the list changes, this definition implements (with the highlighted part) the bias of keeping the length of the first source list whenever possible (to disambiguate multiple candidate source changes). Here, \( cond \), \((\mathbin {\hat{\circ }})\), etc. are lens combinators and \( outListL \) and \( rearr \) are auxiliary lenses, as can be seen from their types. Unlike its unidirectional counterpart, \( appendL \) can no longer be defined as a structural recursion on a list; instead it traverses a pair of lists with a rather complex rearrangement \( rearr \).
Intuitively, the additional highlighted parts are intrinsic complexity, as they are needed for directing backwards execution. However, the complicated recursion scheme, which is a direct result of the underlying limitation of lens languages, is certainly accidental. Recall that in the definition of \( append \), we were able to use the variable \( y \), which is bound outside of the recursion pattern, inside the body of \( foldr \). But the same is not possible with lens combinators, which are strictly ‘point-free’. Moreover, even if one could name such variables (points), their usage with lens combinators will be very restricted in order to guarantee well-behavedness [21, 23]. This problem is specific to opaque non-function objects such as lenses, and goes well beyond the traditional issues associated with the point-free programming style.
As expected, the above code shares the highlighted part with the definition of \( appendL \) as the two implement the same backwards behaviour. The difference is that \( appendB \) uses structural recursion in the same way as the standard unidirectional \( append \), greatly simplifying programming. This is made possible by HOBiT's type system and semantics, allowing unrestricted use of free variables. This difference in approach is also reflected in the types: \( appendB \) is a proper function (instead of the abstract lens type of \( appendL \)), which readily lends itself to conventional functional programming. At the same time, \( appendB \) is also a proper lens, which when executed by the HOBiT interpreter behaves exactly like \( appendL \). A major technical challenge in the design of HOBiT is to guarantee this duality, so that functions like \( appendB \) are well-behaved by construction despite the flexibility in their construction.
1.2 Contributions
 1. It supports the conventional programming style that is used in unidirectional programming. As a result, a program in HOBiT can be defined in a way similar to how one would define only its \( get \) component. For example, \( appendB \) is defined in the same way as the unidirectional \( append \).
 2. It supports incremental improvement. Given the very often close resemblance of a bidirectional-program definition and that of its \( get \) component, it becomes possible to write an initial version of a bidirectional program almost identical to its \( get \) component and then to adjust the backwards behaviour gradually, without having to significantly restructure the existing definition.
Thanks to these distinctive advantages, HOBiT for the first time allows us to construct realistically-sized bidirectional programs with relative ease. Of course, this does not mean a free lunch: the ability to control backwards behaviours will not magically come without additional code (for example the highlighted part above). What HOBiT achieves is that programming effort may now focus on the productive part of specifying backwards behaviours, instead of being consumed by circumventing language restrictions.
In summary, we make the following contributions in this paper.

We design a higher-order bidirectional programming language HOBiT, which supports convenient bidirectional programming with control of backwards behaviours (Sect. 3). We also discuss several extensions to the language (Sect. 5).

We present the semantics of HOBiT inspired by the idea of staging [5], and prove the well-behavedness property using Kripke logical relations [18] (Sect. 4).

We demonstrate the programmability of HOBiT with examples such as desugaring/resugaring [26] (Sect. 6). Additional examples including a bidirectional evaluator for \(\lambda \)-calculus [21, 23], a parser/printer for S-expressions, and bookmark extraction for Netscape [7] can be found at https://bitbucket.org/kztk/hibx together with a prototype implementation of HOBiT.
2 Overview: Bidirectional Programming Without Combinators
In this section, we informally introduce the essential constructs of HOBiT and demonstrate their use by a few small examples. Recall that, as seen in the \( appendB \) example, the strength of HOBiT lies in allowing programmers to access \(\lambda \)-abstractions without restrictions on the use of \(\lambda \)-bound variables.
2.1 The \(\underline{\mathbf {case}}\) Construct
The most important language construct in HOBiT is \(\underline{\mathbf {case}}\) (pronounced as bidirectional case), which provides pattern matching and easy access to bidirectional branching, and also importantly, allows unrestricted use of \(\lambda \)-bound variables.
The pattern matching part of \(\underline{\mathbf {case}}\) performs two implicit operations: it first unwraps the \(\varvec{\mathsf {B}}{}\)-typed value, exposing its content for normal pattern matching, and then it wraps the variables bound by the pattern matching, turning them into ‘updatable’ \(\varvec{\mathsf {B}}{}\)-typed values to be used in the bodies. For example, in the second branch of \( appendB \), a and \(x'\) can be seen as having types A and [A] in the pattern, but \(\varvec{\mathsf {B}}{A}\) and \(\varvec{\mathsf {B}}{[A]}\) types in the body; and the bidirectional constructor \((\mathbin {\underline{:}}) \,{:}{:}\, \varvec{\mathsf {B}}{A} \rightarrow \varvec{\mathsf {B}}{[A]} \rightarrow \varvec{\mathsf {B}}{[A]}\) combines them to produce a \(\varvec{\mathsf {B}}{}\)-typed list.
In addition to the standard conditional branches, a \(\underline{\mathbf {case}}\)-expression has two unique components \(\phi _i\) and \(\rho _i\), called exit conditions and reconciliation functions respectively, which are used in backwards executions. Exit condition \(\phi _i\) is an over-approximation of the forwards-execution results of the expressions \(e_i\). In other words, if branch i is chosen, then \(\phi _i\;e_i\) must evaluate to \(\mathsf {True}\). This assertion is checked dynamically in HOBiT, though it could be checked statically with a sophisticated type system [7]. In the backwards direction the exit condition is used for deciding branching: the branch whose exit condition is satisfied by the updated view will be picked for execution (when more than one matches, the original branch used in the forwards direction has higher priority). The idea is that due to the update in the view, the branch taken in the backwards direction may be different from the one taken in the original forwards execution, a feature that is commonly supported by lens languages [7] and which we call branch switching.
Branch switching is crucial to \( put \)’s robustness, i.e., the ability to handle a wide range of view updates (including those that affect the branching decisions) without failing. We explain its working in detail in the following.
Branch Switching. Being able to choose a different branch in the backwards direction only solves part of the problem. Let us consider the case where a forward execution chooses the \(n^\mathrm {th}\) branch, and the backwards execution, based on the updated view, chooses the \(m^\mathrm {th}\) (\(m \ne n\)) branch. In this case, the original value of the pattern-matched expression e, which is the reason for the \(n^\mathrm {th}\) branch being chosen, is not compatible with the \( put \) of the \(m^\mathrm {th}\) branch.
As we have explained above, exit conditions are used to decide which branch will be used in the backwards direction. For the first and second evaluations of \( put \), the exit conditions corresponding to the original branches were true for the updated view. For the last evaluation of \( put \), since the exit condition of the original branch was false but that of the other branch was true, branch switching is required here. However, a direct \( put \)-execution of f with the inputs \((\mathsf {Right} \;(1,[2,3]))\) and \([\,]\) crashes (represented by \(\bot \) above), for a good reason, as the two inputs are in an inconsistent state with respect to f.
The exit condition for the nil case always returns true as there is no restriction on the value of \( y \), and for the cons case it requires the returned list to be non-empty. In the backwards direction, when the updated view is non-empty, both exit conditions will be true, and then the original branch will be taken. This means that since \( appendB \) is defined as a recursion on x, the backwards execution will try to unroll the original recursion step by step (i.e., the cons branch will be taken for a number of times that is the same as the length of \( x \)) as long as the view remains non-empty. If an updated view list is shorter than \( x \), then \( not \circ null \) will become false before the unrolling finishes, and the nil branch will be taken (branch-switching) and the reconciliation function will be called.
Difference from Lens Combinators. As mentioned above, the idea of branch switching can be traced back to lens languages. In particular, the design of \(\underline{\mathbf {case}}\) is inspired by the combinator \( cond \) [7]. Despite the similarities, it is important to recognise that \(\underline{\mathbf {case}}\) is not only a more convenient syntax for \( cond \), but also crucially supports the unrestricted use of \(\lambda \)-bound variables. This more fundamental difference is the reason why we could define \( appendB \) in the conventional functional style as the variables \( x \) and \( y \) are used freely in the body of \(\underline{\mathbf {case}}\). In other words, the novelty of HOBiT is its ability to combine the traditional (higher-order) functional programming and the bidirectional constructs as found in lens combinators, effectively establishing a new way of bidirectional programming.
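The backwards behaviour just described can be summarised by a small first-order model in plain Haskell. This is only a sketch of the observable \( get \)/\( put \) behaviour, not HOBiT code: the record type `Lens` and the name `appendModel` are our own, and we assume that on branch switching the second list is reset to the empty list.

```haskell
-- A first-order lens as a pair of functions (a model, not HOBiT syntax).
data Lens s v = Lens { get :: s -> v, put :: s -> v -> s }

-- Hypothetical model of the behaviour described for appendB.
appendModel :: Lens ([a], [a]) [a]
appendModel = Lens fwd bwd
  where
    fwd (x, y) = x ++ y
    -- splitAt keeps (length x) elements in the first component when the
    -- view is long enough; a shorter view goes entirely to the first
    -- component and leaves the second list empty.
    bwd (x, _) v = splitAt (length x) v
```

For instance, with the source `([1,2],[3])`, the view `[4,5,6,7]` is split into `([4,5],[6,7])`, whereas the shorter view `[9]` yields `([9],[])`.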
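To make the comparison concrete, the following is a simplified first-order branching combinator in the spirit of \( cond \), written in plain Haskell. It is a sketch under our own assumptions, not the combinator of [7] nor HOBiT's semantics: the names `Lens`, `Branch`, `condL` and the example lens are hypothetical, \( put \) is modelled as a partial function via `Maybe`, and the reconciliation function is assumed to receive the original source and the updated view.

```haskell
-- First-order lens with a partial put (a model, not HOBiT syntax).
data Lens s v = Lens { get :: s -> v, put :: s -> v -> Maybe s }

-- A branch: its body lens, exit condition (phi) and reconciliation
-- function (rho).
data Branch s v = Branch
  { body      :: Lens s v
  , exitCond  :: v -> Bool
  , reconcile :: s -> v -> s
  }

-- Two-way branching on a source predicate.
condL :: (s -> Bool) -> Branch s v -> Branch s v -> Lens s v
condL isFirst b1 b2 = Lens fwd bwd
  where
    branch first = if first then b1 else b2
    fwd s = get (body (branch (isFirst s))) s
    bwd s v
      -- The original branch has priority when its exit condition holds.
      | exitCond (branch orig) v  = putIn orig s
      -- Branch switching: reconcile the source before running put.
      | exitCond (branch other) v = putIn other (reconcile (branch other) s v)
      | otherwise                 = Nothing
      where
        orig  = isFirst s
        other = not orig
        putIn first s0 = do
          s' <- put (body (branch first)) s0 v
          -- The new source must select the branch that produced it.
          if isFirst s' == first then Just s' else Nothing

-- Hypothetical example: a list view of an optional non-empty list.
type Src = Either () (Int, [Int])

isNil :: Src -> Bool
isNil (Left _) = True
isNil _        = False

nilBranch :: Branch Src [Int]
nilBranch = Branch
  { body      = Lens (const []) (\s v -> if null v then Just s else Nothing)
  , exitCond  = null
  , reconcile = \_ _ -> Left ()
  }

consBranch :: Branch Src [Int]
consBranch = Branch
  { body      = Lens fwd bwd
  , exitCond  = not . null
  , reconcile = \_ _ -> Right (0, [])
  }
  where
    fwd (Right (a, rest)) = a : rest
    fwd (Left ())         = []   -- not reached when used through condL
    bwd _ (a : rest) = Just (Right (a, rest))
    bwd _ []         = Nothing

example :: Lens Src [Int]
example = condL isNil nilBranch consBranch
```

Note that both branch bodies here are closed lenses fixed in advance; this is precisely the restriction that \(\underline{\mathbf {case}}\) lifts by letting branch bodies mention variables bound outside.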
2.2 A More Elaborate Example: \( linesB \)
This behaviour is achieved by the definition in Fig. 2, which makes good use of reconciliation functions. Note that we do not consider the contrived corner case where the string ends with duplicated newlines. The function \( breakNLB \) splits a string at the first newline; since \( breakNLB \) is injective, its exit conditions and reconciliation functions are of little interest. The interesting part is in the definition of \( linesB \), particularly its use of reconciliation functions to track the existence of a last newline character. We first explain the branching structure of the program. On the top level, when the first line is removed from the input, the remaining string b may contain more lines, or be the end (represented by either the empty list or the singleton list containing only a newline character). If the first branch is taken, the returned result will be a list of more than one element. In the second branch, when it is the end of the text, b could contain a newline or simply be empty. We do not explicitly give patterns for the two cases as they have the same body \(f \mathbin {\underline{:}}\underline{[\,]}\), but the reconciliation function distinguishes the two in order to preserve the original source structure in the backwards execution. Note that we intentionally use the same variable name b in the case analysis and the reconciliation function, to signify that the two represent the same source data. The use of argument b in the reconciliation functions serves the purpose of remembering the (non)existence of the last newline in the original source, which is then preserved in the new source.
It is worth noting that just like the other examples we have seen, this definition in HOBiT shares a similar structure with a definition of \( lines \) in Haskell.^{1} The notable difference is that a Haskell definition is likely to have a different grouping of the three cases of \( lines \) into two branches, as there is no need to keep track of the last newline for backwards execution. Recall that reconciliation functions are called after branches are chosen by exit conditions; in the case of \( linesB \), the reconciliation function is used to decide whether the reconciled value of \(b'\) is the empty string or the single-newline string. This, however, means that we cannot split the pattern \(b'\) into two patterns, one for each of these values, by copying its branch body and exit condition, because then we would lose the chance to choose a reconciled value of b based on its original value.
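The role of the remembered last newline can be illustrated by a first-order model in plain Haskell. This is a sketch of the intended \( get \)/\( put \) behaviour as we read it from the description above, not the definition of \( linesB \); the names `linesGet` and `linesPut` are ours. Views whose last line is empty, or whose lines themselves contain newline characters, fall outside the domain of this model.

```haskell
import Data.List (intercalate, isSuffixOf)

-- Forwards: the standard lines function.
linesGet :: String -> [String]
linesGet = lines

-- Backwards: rebuild the text from the updated lines, keeping a final
-- newline exactly when the original source had one.
linesPut :: String -> [String] -> String
linesPut src view
  | "\n" `isSuffixOf` src = unlines view
  | otherwise             = intercalate "\n" view
```

Both `"a\nb"` and `"a\nb\n"` have the view `["a","b"]`, but putting back `["x","y","z"]` gives `"x\ny\nz"` for the former and `"x\ny\nz\n"` for the latter.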
3 Syntax and Type System of HOBiT Core
In this section, we describe the syntax and the type system of the core of HOBiT.
3.1 Syntax
Although in examples we used \(\mathbf {case}\)/\(\underline{\mathbf {case}}\)-expressions with an arbitrary number of branches having overlapping patterns under the first-match principle, we assume for simplicity that in HOBiT Core \(\mathbf {case}\)/\(\underline{\mathbf {case}}\)-expressions must have exactly two branches whose patterns do not overlap; extensions to support these features are straightforward. As in Haskell, we sometimes omit the braces and semicolons if they are clear from the layout.
3.2 Type System
The typing judgment \({\varGamma };{\varDelta } \vdash {e} : {A}\), which reads that under environments \(\varGamma \) and \(\varDelta \), expression e has type A, is defined by the typing rules in Fig. 4. We use two environments: \(\varDelta \) (the bidirectional type environment) is for variables introduced by pattern-matching through \(\underline{\mathbf {case}}\), and \(\varGamma \) for everything else. It is interesting to observe that \(\varDelta \) only holds pure datatypes, as the pattern variables of \(\underline{\mathbf {case}}\) have pure datatypes, while \(\varGamma \) holds any types. We assume that the variables in \(\varGamma \) and those in \(\varDelta \) are disjoint, and appropriate \(\alpha \)-renaming has been done to ensure this. This separation of \(\varDelta \) from \(\varGamma \) does not affect typeability, but is key to our semantics and correctness proof (Sect. 4). Most of the rules are standard except \(\underline{\mathbf {case}}\); recall that we only use unidirectional constructors in patterns which have pure types, while the variables bound in the patterns are used as \(\varvec{\mathsf {B}}{}\)-typed values in branch bodies.
4 Semantics of HOBiT Core
Recall that the unique strength of HOBiT is its ability to mix higher-order unidirectional programming with bidirectional programming. A consequence of this mixture is that we can no longer specify its semantics in the same way as other first-order bidirectional languages such as [13], where two semantics (one for \( get \) and the other for \( put \)) suffice. This is because the category of lenses is believed to have no exponential objects [27] (and thus does not permit \(\lambda \)s).
4.1 Basic Idea: Staging
Our solution to this problem is staging [5], which separates evaluation into two stages: the unidirectional part is evaluated first to make way for a bidirectional semantics, which only has to deal with the residual first-order programs. As a simple example, consider the expression \((\lambda z.z) \;(x \mathbin {\underline{:}}((\lambda w.w) \;y) \mathbin {\underline{:}}\underline{[\,]})\). The first-stage evaluation, \(e \Downarrow _\mathrm {U} E\), eliminates \(\lambda \)s from the expression as in \( (\lambda z.z) \;(x \mathbin {\underline{:}}((\lambda w.w) \;y) \mathbin {\underline{:}}\underline{[\,]}) \Downarrow _\mathrm {U} x \mathbin {\underline{:}}y \mathbin {\underline{:}}\underline{[\,]} \). Then, our bidirectional semantics will be able to treat the residual expression as a lens between value environments and values, following [13, 20]. Specifically, we have the \( get \) evaluation relation \(\mu \vdash _\mathrm {G} E \Rightarrow v\), which computes the value v of E under environment \(\mu \) as usual, and the \( put \) evaluation relation \(\mu \vdash _\mathrm {P} v \Leftarrow E \dashv \mu '\), which computes an updated environment \(\mu '\) for E from the updated view v and the original environment \(\mu \). In pseudo syntax, it can be understood as \( put \;E \;\mu \;v = \mu '\), where \(\mu \) represents the original source and \(\mu '\) the new source.
It is worth mentioning that a complete separation of the stages is not possible due to the combination of \(\mathbf {fix}\) and \(\underline{\mathbf {case}}\), as an attempt to fully evaluate them in the first stage will result in divergence. Thus, we delay the unidirectional evaluation inside \(\underline{\mathbf {case}}\) to allow \(\mathbf {fix}\), and consequently the three evaluation relations (unidirectional, \( get \), and \( put \)) are mutually dependent.
4.2 Three Evaluation Relations: Unidirectional, \( get \) and \( put \)
Bidirectional (\( get \) and \( put \)) Evaluation Relations. The \( get \) and \( put \) evaluation relations, \(\mu \vdash _\mathrm {G} E \Rightarrow v\) and \(\mu \vdash _\mathrm {P} v \Leftarrow E \dashv \mu '\), are defined so that they together form a lens.
Our solution to this problem, which follows from [21, 22, 23, 29], is to allow \( put \) to return value environments containing only bindings that are relevant for the residual expressions under evaluation. For example, we have \(\mu \vdash _\mathrm {P} 3 \Leftarrow x \dashv \left\{ x = 3\right\} \), and \(\mu \vdash _\mathrm {P} 4 : [\,] \Leftarrow y \mathbin {\underline{:}}\underline{[\,]} \dashv \left\{ y = 4\right\} \). Then, we can merge the two value environments \(\mu '_1 = \left\{ x = 3\right\} \) and \(\mu '_2 = \left\{ y = 4\right\} \) to obtain the expected result \(\left\{ x = 3, y = 4\right\} \). As a remark, this seemingly simple solution actually has a nontrivial effect on the reasoning of well-behavedness. We defer a detailed discussion on this to Sect. 4.3.
The \( put \) evaluation rule of \(\underline{\mathbf {case}}\) shown in Fig. 6 is more involved. In addition to checking which branch should be chosen by using exit conditions, we need two rules to handle the cases with and without branch switching. Basically, the branch to be taken in the backwards direction is decided first, by the \( get \)-evaluation of the case condition \(E_{0}\) and the checking of the exit condition \(E_i'\) against the updated view v. After that, the body of the chosen branch \(e_{i}\) is first unidirectionally evaluated, and then its residual expression \(E_{i}\) is \( put \)-evaluated. The last step is the \( put \)-evaluation of the case condition \(E_{0}\). When branch switching happens, there is the additional step of applying the reconciliation function \(E''_{j}\).
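The idea of returning only the relevant bindings and merging them afterwards can be sketched for a tiny fragment of residual expressions (variables and bidirectional list constructors only). The following Haskell model is ours and purely illustrative: the types `Val`, `Res` and `Env` and the functions `getR`, `putR` and `merge` are hypothetical names, environments are association lists, and \(\underline{\mathbf {case}}\) is left out entirely.

```haskell
-- First-order values and residual expressions (a tiny fragment).
data Val = VInt Int | VNil | VCons Val Val deriving (Eq, Show)
data Res = Var String | NilB | ConsB Res Res deriving (Eq, Show)

-- Value environments as association lists.
type Env = [(String, Val)]

-- get: evaluate a residual expression under an environment.
getR :: Env -> Res -> Maybe Val
getR mu (Var x)     = lookup x mu
getR _  NilB        = Just VNil
getR mu (ConsB a b) = VCons <$> getR mu a <*> getR mu b

-- Merge two environments; fail if they disagree on a shared variable.
merge :: Env -> Env -> Maybe Env
merge m1 m2
  | all agrees m2 = Just (m1 ++ [b | b@(x, _) <- m2, lookup x m1 == Nothing])
  | otherwise     = Nothing
  where
    agrees (x, v) = maybe True (== v) (lookup x m1)

-- put: return only the bindings relevant to the expression.
-- The original environment is unused in this fragment; it is kept to
-- mirror the shape of the put judgment.
putR :: Env -> Val -> Res -> Maybe Env
putR _  v             (Var x)     = Just [(x, v)]
putR _  VNil          NilB        = Just []
putR mu (VCons v1 v2) (ConsB a b) = do
  m1 <- putR mu v1 a
  m2 <- putR mu v2 b
  merge m1 m2
putR _  _             _           = Nothing
```

Putting the view \(3 : 4 : [\,]\) into \(x \mathbin {\underline{:}}y \mathbin {\underline{:}}\underline{[\,]}\) yields the bindings for \( x \) and \( y \) separately and merges them, while an expression that mentions the same variable twice makes `merge` fail when the two occurrences receive different values.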
4.3 Correctness
We establish the correctness of HOBiT Core: the lens obtained from a closed expression e of type \(\varvec{\mathsf {B}}{\sigma } \rightarrow \varvec{\mathsf {B}}{\tau }\) is well-behaved. Recall that \( Lens \;{S}\;{V}\) is a set of lenses \(\ell \), where \( get \;\ell \in S \rightarrow V\) and \( put \;\ell \in S \rightarrow V \rightarrow S\). We only provide proof sketches in this subsection due to space limitations.
\(\varvec{\mathrel {\preceq }}\)-well-behavedness. Recall that in the previous subsection, we allow environments to be weakened during \( put \)-evaluation. Since not all variables in a source may appear in the view, during some intermediate evaluation steps (for example within \(\underline{\mathbf {case}}\)-branches) the weakened environment may not be sufficient to fully construct a new source. Recall that, in \(\mu \vdash _\mathrm {P} v \Leftarrow e \dashv \mu '\), \(\mathsf {dom}(\mu ')\) can be smaller than \(\mathsf {dom}(\mu )\), a gap that is fixed at a later stage of evaluation by merging (\(\mathbin {\curlyvee }\)) and defaulting (\(\triangleleft \)) with other environments. This technique reduces conflicts, but at the same time complicates the compositional reasoning of correctness. Specifically, due to the potentially missing information in the intermediate environments, well-behavedness may be temporarily broken during evaluation. Instead, we use a variant of well-behavedness that is weakening-aware, which will then be used to establish the standard well-behavedness for the final result.
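As a reminder of what ordinary well-behavedness asks for, the two round-trip laws can be written as executable checks for total lenses. This is a sketch assuming the standard formulation (acceptability: putting back an unchanged view leaves the source unchanged; consistency: the updated view is what \( get \) returns afterwards). The record type `Lens`, the checker names and the two sample lenses are ours, and partiality of \( put \) is ignored.

```haskell
-- A total first-order lens, with get : S -> V and put : S -> V -> S.
data Lens s v = Lens { get :: s -> v, put :: s -> v -> s }

-- Acceptability: put s (get s) == s.
acceptable :: Eq s => Lens s v -> s -> Bool
acceptable l s = put l s (get l s) == s

-- Consistency: get (put s v) == v.
consistent :: Eq v => Lens s v -> s -> v -> Bool
consistent l s v = get l (put l s v) == v

-- A well-behaved sample lens: the first component of a pair.
fstL :: Lens (a, b) a
fstL = Lens fst (\(_, b) a -> (a, b))

-- A sample lens that violates acceptability: put forgets the
-- second component of the original source.
forgetful :: Lens (Int, Int) Int
forgetful = Lens fst (\_ a -> (a, 0))
```

Checks of this kind only test individual sources and views; the results of this section establish the laws for all inputs on which evaluation is defined.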
Definition 1
We write \( Lens ^\mathrm {\mathrel {\preceq }{}wb} \;S \;V\) for the set of lenses in \( Lens \;{S}\;{V}\) that are \(\mathrel {\preceq }\)-well-behaved. In this section, we only consider the case where S and V are value environments and first-order values, where value environments are ordered by weakening (\(\mu \mathrel {\preceq }\mu '\) if \(\mu (x) = \mu '(x)\) for all \(x \in \mathsf {dom}(\mu )\)), and \((\mathrel {\preceq }) = (=)\) for first-order values. In Sect. 5.2 we consider a slightly more general situation.
The \(\mathrel {\preceq }\)-well-behavedness is a generalisation of the ordinary well-behavedness, as it coincides with the ordinary well-behavedness when \((\mathrel {\preceq }) = (=)\).
Theorem 1
For S and V with \((\mathrel {\preceq }) = (=)\), a lens \(\ell \in Lens \;{S}\;{V}\) is \(\mathrel {\preceq }\)-well-behaved iff it is well-behaved. \(\square \)
Kripke Logical Relation. The key step to prove the correctness of HOBiT Core is to prove that \(\mathcal {L}_{0}[\![ E ]\!]\) is always \(\mathrel {\preceq }\)-well-behaved if E is an evaluation result of a well-typed expression e. The basic idea is to prove this by a logical relation stating that an expression e of type \(\varvec{\mathsf {B}}{\sigma }\) under the context \(\varDelta \) is evaluated to E, assuming termination, such that \(\mathcal {L}_{0}[\![ E ]\!]\) is a \(\mathrel {\preceq }\)-well-behaved lens between \([\![\varDelta ]\!]\) and \([\![\sigma ]\!]\).
Usually a logical relation is defined only by induction on the type. In our case, as we need to consider \(\varDelta \) in the interpretation of \(\varvec{\mathsf {B}}{\sigma }\), the relation should be indexed by \(\varDelta \) too. However, naive indexing does not work due to substitutions. For example, we could define a (unary) relation \(\mathcal {E}_\varDelta (\varvec{\mathsf {B}}{\sigma })\) as a set of expressions that evaluate to “good” (i.e., \(\mathrel {\preceq }\)-well-behaved) lenses between (the semantics of) \(\varDelta \) and \(\sigma \), and \(\mathcal {E}_\varDelta (\varvec{\mathsf {B}}{\sigma } \rightarrow \varvec{\mathsf {B}}{\tau })\) as a set of expressions that evaluate to “good” functions that map good lenses between \(\varDelta \) and \(\sigma \) to those between \(\varDelta \) and \(\tau \). This naive relation, however, does not respect substitution, which can substitute a value obtained from an expression typed under \(\varDelta \) to a variable typed under \(\varDelta '\) such that \(\varDelta \subseteq \varDelta '\), where \(\varDelta \) and \(\varDelta '\) need not be the same. With the naive definition, good functions at \(\varDelta \) need not be good functions at \(\varDelta '\), as a good lens between \(\varDelta '\) and \(\sigma \) is not always a good lens between \(\varDelta \) and \(\sigma \).
To remedy the situation, inspired by the denotational semantics in [24], we use Kripke logical relations [18] where worlds are \(\varDelta \)s.
Definition 2
The notable difference from ordinary logical relations is the definition of Open image in new window where we consider an arbitrary \(\varDelta '\) such that \(\varDelta \subseteq \varDelta '\). This is the key to state Open image in new window if \(\varDelta \subseteq \varDelta '\). Notice that Open image in new window for any \(\varDelta \).
We have the following lemmas.
Lemma 1
If \(\varDelta \subseteq \varDelta '\), Open image in new window implies Open image in new window . \(\square \)
Lemma 2
Open image in new window for any \(\varDelta \) such that \(\varDelta (x) = \sigma \). \(\square \)
Lemma 3
For any \(\sigma \) and \(\varDelta \), Open image in new window and Open image in new window . \(\square \)
Lemma 4
If Open image in new window and Open image in new window , then Open image in new window . \(\square \)
Lemma 5
Let \(\sigma \) and \(\tau \) be pure types and \(\varDelta \) a pure type environment. Suppose that Open image in new window for \({\varDelta _{i}} \vdash {p_{i}} : {\sigma }\) (\(i = 1,2\)), and that Open image in new window , Open image in new window and Open image in new window . Then, Open image in new window .
Proof
(Sketch). The proof itself is straightforward by case analysis. The key property is that \( get \) and \( put \) use the same branches in both proofs of \({\mathrel {\preceq }}{\text {-}}{} \mathbf{Acceptability}\) and \({\mathrel {\preceq }}{\text {-}}{} \mathbf{Consistency}\). Slight care is required for unidirectional evaluations of \(e_{1}\) and \(e_{2}\), and applications of \(E'_{1},E'_{2},E''_{1}\) and \(E''_{2}\). However, the semantics is carefully designed so that in the proof of \({\mathrel {\preceq }}{\text {-}}{} \mathbf{Acceptability}\), unidirectional evaluations that happen in \( put \) have already happened in the evaluation of \( get \), and a similar discussion applies to \({\mathrel {\preceq }}{\text {-}}{} \mathbf{Consistency}\). \(\square \)
As a remark, recall that we assumed \(\alpha \)-renaming of \(p_{i}\) so that the disjoint unions (\(\uplus \)) in Fig. 6 succeed. This renaming depends on the \(\mu \)s received in \( get \) and \( put \) evaluations, and can be realised by using de Bruijn levels.
Lemma 6
(Fundamental Lemma). For \({\varGamma };{\varDelta } \vdash {e} : {A}\), for any \(\varDelta '\) with \(\varDelta \subseteq \varDelta '\) and Open image in new window , we have Open image in new window .
Proof
(Sketch). We prove the lemma by induction on typing derivation. For bidirectional constructs, we just apply the above lemmas appropriately. The other parts are rather routine. \(\square \)
Now we are ready to state the correctness of our construction of lenses.
Corollary 1
If \({\varepsilon };{\varepsilon } \vdash {e} : {\varvec{\mathsf {B}}{\sigma } \rightarrow \varvec{\mathsf {B}}{\tau }}\), then Open image in new window . \(\square \)
Lemma 7
If Open image in new window , Open image in new window (if defined) is in Open image in new window (and thus wellbehaved by Theorem 1). \(\square \)
Theorem 2
If \({\varepsilon };{\varepsilon } \vdash {e} : {\varvec{\mathsf {B}}{\sigma } \rightarrow \varvec{\mathsf {B}}{\tau }}\), then Open image in new window (if defined) is wellbehaved. \(\square \)
5 Extensions
Before presenting a larger example, we discuss a few extensions of HOBiT Core which facilitate programming.
5.1 In-Language Lens Definition
5.2 Lens Combinators as Language Constructs
In this paper, we have focused on the \(\underline{\mathbf {case}}\) construct, which is inspired by the \( cond \) combinator [7]. Although \( cond \) is certainly an important lens combinator, it is not the only one worth considering. Actually, we can obtain language constructs from a number of lens combinators including those that take care of alignment [2]. For the sake of demonstration, we outline the derivation of a simpler example Open image in new window . As the construction depends solely on types, we purposely leave the combinator abstract.
The combinator preserves \(\mathrel {\preceq }\)-well-behavedness, and thus \(\underline{\mathbf {comb}}_\mathrm {bad}\) guarantees correctness. However, as discussed extensively in the case of \(\underline{\mathbf {case}}\), this “closedness” requirement prevents flexible use of variables and creates a major obstacle in programming.
Even better, the parametrised \( pcomb \) can be systematically constructed from the definition of \( comb \). For \( comb \), it is typical that \( get \;( comb \;\ell )\) only uses \( get \;\ell \), and \( put \;( comb \;\ell )\) uses \( put \;\ell \); that is, \( comb \) essentially consists of two functions of types [formula missing in source] and [formula missing in source]. Then, we can obtain \( pcomb \) of the above type merely by “monad”ifying the two functions: using the reader monad \(T \rightarrow {}\) for the former and the composition of the reader and writer monads \(T \rightarrow ({}, T)\) backwards for the latter suffices to construct \( pcomb \).
A remaining issue is to ensure that \( pcomb \) preserves \(\mathrel {\preceq }\)-well-behavedness, which ensures [formula missing in source] under the assumptions [formula missing in source] and [formula missing in source]. Currently, such a proof has to be done manually, even though \( comb \) preserves well-behavedness and \( pcomb \) is systematically constructed. Whether we can lift the correctness proof for \( comb \) to \( pcomb \) in a systematic way is an interesting direction for future exploration.
5.3 Guards
Guards used for branching are merely syntactic sugar in ordinary unidirectional languages such as Haskell. But interestingly, they actually increase the expressive power of HOBiT, by enabling inspection of updatable values without making the inspection functions bidirectional.
Here, \(\mathopen {\underline{(}} , \mathclose {\underline{)}}\) is the bidirectional version of the pair constructor. The exit condition \( isRight \) checks whether a value is headed by the constructor \(\mathsf {Right}\), and \( isLeft \) by \(\mathsf {Left}\). Notice that the backwards transformation of \( eqCheck \) fails when the updated view is \(\mathsf {Left}\;(v,v)\) for some v.
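The failure case mentioned above can be illustrated with a small Python sketch. The definition of \( eqCheck \) is not reproduced in this text, so the version below is an assumed reconstruction: the forward direction returns \(\mathsf {Right}\;x\) when the two components are equal and \(\mathsf {Left}\;(x,y)\) otherwise, and the backward direction selects a branch by the exit conditions and then rejects updates that contradict the guard of the chosen branch. The function names and the tuple encoding of \(\mathsf {Left}\)/\(\mathsf {Right}\) are hypothetical.

```python
def eq_check_get(source):
    """Forward direction (assumed): inspect the pair with a plain equality guard."""
    x, y = source
    if x == y:                      # guard of the first branch
        return ("Right", x)
    return ("Left", (x, y))         # the "otherwise" branch


def eq_check_put(source, view):
    """Backward direction (assumed); the original source is not needed here."""
    tag, payload = view
    if tag == "Right":              # exit condition isRight selects branch 1
        return (payload, payload)
    if tag == "Left":               # exit condition isLeft selects branch 2
        x, y = payload
        if x == y:
            # The updated source (v, v) would satisfy the guard of branch 1,
            # so the forward direction could not return this Left view.
            raise ValueError("updated view Left (v, v) contradicts the guard")
        return (x, y)
    raise ValueError("unknown view constructor")
```

Note that the equality test itself stays an ordinary unidirectional function; only the branch bodies need a backward reading, which is the extra expressive power that guards provide.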
5.4 Syntax Sugar for Reconciliation Functions
5.5 Inference of Exit Conditions
It is possible to infer exit conditions from their surrounding contexts, an idea that has been studied in the literature on invertible programming [11, 20] and that may benefit from range analysis.
Our prototype implementation adopts a very simple inference that constructs an exit condition \(\lambda x.\,\mathbf {case}\;x\;\mathbf {of}\;\{p_e \rightarrow \mathsf {True};\;\_ \rightarrow \mathsf {False}\}\) for each branch, where \(p_e\) is the skeleton of the branch body e, constructed by replacing bidirectional constructors with their unidirectional counterparts, and non-constructor expressions with \(\_\). For example, from \(a \mathbin {\underline{:}} appendB \;x' \;y\), we obtain the pattern \(\_ : \_\). This embarrassingly simple inference has proven to be handy for developing larger HOBiT programs, as we will see in Sect. 6.
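The skeleton-based inference described above can be sketched in Python as follows. This is not the prototype's code: the expression encoding (`BCon` for a bidirectional constructor application, `Opaque` for any non-constructor expression), the wildcard representation, and the encoding of view values as `(constructor, arguments)` tuples are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Tuple, Union

WILDCARD = "_"


@dataclass(frozen=True)
class BCon:
    """Application of a bidirectional constructor (hypothetical encoding)."""
    name: str
    args: Tuple["Expr", ...]


@dataclass(frozen=True)
class Opaque:
    """Any non-constructor expression: a variable, a function call, ..."""
    text: str


Expr = Union[BCon, Opaque]


def skeleton(expr):
    """Keep constructors (as unidirectional ones), replace the rest by a wildcard."""
    if isinstance(expr, BCon):
        return (expr.name, tuple(skeleton(a) for a in expr.args))
    return WILDCARD


def matches(pattern, value) -> bool:
    """Match a view value, encoded as (constructor, args), against a skeleton."""
    if pattern == WILDCARD:
        return True
    name, subpatterns = pattern
    if not (isinstance(value, tuple) and len(value) == 2
            and isinstance(value[1], tuple)):
        return False
    value_name, value_args = value
    return (name == value_name
            and len(subpatterns) == len(value_args)
            and all(matches(p, a) for p, a in zip(subpatterns, value_args)))


def exit_condition(branch_body):
    """Inferred exit condition: does the view match the body's skeleton?"""
    pattern = skeleton(branch_body)
    return lambda view: matches(pattern, view)


# Body of the cons branch from the text: a :_ appendB x' y
cons_body = BCon(":", (Opaque("a"), Opaque("appendB x' y")))
cons_view = (":", (1, ("[]", ())))   # the list [1]
nil_view = ("[]", ())                # the empty list
```

With this encoding, the inferred condition for the cons branch accepts any non-empty list and rejects the empty one, which is exactly what is needed to pick a branch in the backward direction.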
6 An Involved Example: Desugaring
In this section, we demonstrate the programmability of HOBiT using the example of bidirectional desugaring [26]. Desugaring is a standard process for most programming languages, and making it bidirectional allows information in desugared form to be propagated back to the surface programs. It is argued convincingly in [26] that such bidirectional propagation (coined resugaring) is effective in mapping reduction sequences of desugared programs into those of the surface programs.
Variables are represented as de Bruijn indices.
We start with an auxiliary function \( compos \) [4] in Fig. 7, which is a useful building block for defining shifting and desugaring. We have omitted the straightforward exit conditions; they will be inferred as explained in Sect. 5.5. The function \( mapB \) is the bidirectional map. The reconciliation function \( recE \) tries to preserve as much source structure as possible by reusing the original source e. Here, \( arities \,{:}{:}\,[( Name , Int )]\) maps operator names to their arities (i.e. [formula missing in source]). The function \( shift \) is the standard unidirectional shifting function. We omit its definition as it is similar to the bidirectional version in Fig. 8. Note that \(\mathrel {\underline{\mathbf {default}}}\) is the syntactic sugar for reconciliation functions introduced in Sect. 5.4. Here, \( incB \) is the bidirectional increment function defined in Sect. 5.1. Thanks to \( composB \), we only need to define the interesting parts in the definitions of \( shiftB \) and \( desugarB \). The reconciliation functions \( recE \) and \( toOp \) try to keep as much source information as possible, which enables the behaviour that the backwards execution produces “not” and “or” in the sugared form only if the original expression has the sugar.
As the AST structure of the view is changed, all three cases require branch-switching in the backwards executions; our program handles it with ease. For \( view _2\), the top-level expression \(\mathsf {EIf} \;\mathsf {EFalse} \;\mathsf {EFalse}~...\) does not have a corresponding sugared form. Our program keeps the top level unchanged, and proceeds to the subexpression with correct resugaring, a behaviour enabled by the appropriate use of a reconciliation function (the first line of \( recE \) for this particular case) in \( composB \).
If we were to present the above results as the evaluation steps in the surface language, one may argue that the second result above does not correspond to a valid evaluation step in the surface language. In [26], AST nodes introduced in desugaring are marked with the information of the original sugared syntax, and resugaring results containing the marked nodes will be skipped, as they do not correspond to any reduction step in the surface language. The marking also makes the backwards behaviour more predictable and stable for drastic changes on the view, as the desugaring becomes injective with this change. This technique is orthogonal to our exploration here, and may be combined with our approach.
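The resugaring behaviour discussed in this section (sugar is restored only where the original source had it) can be sketched in plain Python. This is a deliberately simplified model and not the HOBiT program of the figures: it ignores variables, de Bruijn shifting and \( composB \), uses hypothetical constructor tags `"Not"` and `"Or"` for the sugar, and desugars “or” directly into a conditional. The fallback that reuses the whole original source for every subexpression is an assumption loosely modelled on the description of \( recE \).

```python
TRUE = ("True",)
FALSE = ("False",)


def desugar(e):
    """Forward direction: eliminate the assumed sugar forms Not and Or."""
    tag = e[0]
    if tag == "Not":
        return ("If", desugar(e[1]), FALSE, TRUE)
    if tag == "Or":
        return ("If", desugar(e[1]), TRUE, desugar(e[2]))
    if tag == "If":
        return ("If", desugar(e[1]), desugar(e[2]), desugar(e[3]))
    return e  # True / False


def resugar(src, view):
    """Backward direction: rebuild a surface term from an updated core term."""
    if view[0] != "If":
        return view  # literals carry no sugar
    _, c, t, e = view
    # Restore sugar only if the original source had it and the view still fits.
    if src[0] == "Not" and t == FALSE and e == TRUE:
        return ("Not", resugar(src[1], c))
    if src[0] == "Or" and t == TRUE:
        return ("Or", resugar(src[1], c), resugar(src[2], e))
    if src[0] == "If":
        return ("If", resugar(src[1], c), resugar(src[2], t), resugar(src[3], e))
    # Branch switch: keep the top level as a conditional and reuse the
    # original source as a hint for every subexpression (an assumption).
    return ("If", resugar(src, c), resugar(src, t), resugar(src, e))


sugared = ("Not", ("Or", TRUE, FALSE))
plain = ("If", TRUE, FALSE, TRUE)
updated_view = ("If", ("If", FALSE, TRUE, FALSE), FALSE, TRUE)
```

Running `resugar` on the same updated view yields a sugared term for `sugared` and an unsugared one for `plain`, and in both cases desugaring the result gives back the updated view.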
7 Related Work
Controlling Backwards Behaviour. In addition to \( put \in S \rightarrow V \rightarrow S\), many lens languages [3] supply a \(\textit{create} \in V \rightarrow S\) (which is in essence a right-inverse of \( get \)) to be used when the original source data is unavailable. This happens when new data is inserted in the view, which does not have any corresponding source for \( put \) to execute, or when branch-switching happens but with no reconciliation function available. Being a right-inverse, \(\textit{create}\) does not fail (assuming it terminates), but since it is not guided by the original source, the results are more arbitrary. We do not include \(\textit{create}\) in HOBiT, as it complicates the system without offering obvious benefits. Our branch-switching facilities are perfectly capable of handling missing source data via reconciliation functions.
Using exit conditions in branching constructs for backwards evaluation can be found in a number of related fields: bidirectional transformation [7], reversible computation [34] and program inversion [11, 20]. Our design of \(\underline{\mathbf {case}}\) is inspired by the \(\textit{cond}\) combinator in the lens framework [7] and the if-statement in Janus [34]. A similar combinator is \(\textit{Case}\) in BiGUL [16], where a branch has a function that plays a role similar to that of an exit condition, but additionally takes the original source. This difference makes \(\textit{Case}\) more expressive than \(\textit{cond}\); for example, \(\textit{Case}\) can implement matching lenses [2]. Our design of \(\underline{\mathbf {case}}\) follows \(\textit{cond}\) for its relative simplicity, but the same underlying technique can be applied to \(\textit{Case}\) as mentioned in Sect. 5.2. In the context of bidirectionalization [19, 29, 30] there is the idea of “Plug-ins” [31], which are similar to reconciliation functions in the sense that source values can be adapted to direct backwards execution.
Applicative Lenses. The applicative lens framework [21, 23] provides a way to use \(\lambda \)-abstraction and function application as in normal functional programming to compose lenses. Note that this use of “applicative” refers to the classical applicative (functional) programming style, and is not directly related to the Applicative functor in Haskell. In this sense, it shares a similar goal with ours. But crucially, applicative lens lacks HOBiT’s ability to allow \(\lambda \)-bound variables to be used freely, and as a result suffers from the same limitation as lens languages. There are also a couple of technical differences between applicative lens and our work: applicative lens is based on the Yoneda embedding while ours is based on separating \(\varGamma \) and \(\varDelta \) and having three semantics (Sect. 4); and applicative lens is implemented as an embedded DSL, while HOBiT is given as a standalone language. An embedded implementation of HOBiT is possible, but a type-correct embedding would expose the handling of the environment \(\varDelta \) to programmers, which is undesirable.
Lenses and Their Extensions. As mentioned in Sect. 1, the most common way to construct lenses is by using combinators [3, 7, 8], in which lenses are treated as opaque objects and composed by using lens combinators. Our goal in this paper is to enhance the programmability of lens programming, while keeping as much of its expressive power as possible. In HOBiT, primitive lenses can be represented as functions on \(\varvec{\mathsf {B}}{}\)-typed values (Sect. 5.1), and lens combinators satisfying certain conditions can be represented as language constructs with binders (Sect. 5.2), which is at least enough to express the original lenses in [7].
Among extensions of the lens language [2, 3, 7, 8, 9, 16, 17, 27, 32], there exist a few that extend the classical lens model [7], namely quotient lenses [8], symmetric lenses [14], and edit-based lenses [15]. A natural question to ask is whether our development, which is based on the classical lenses, can be extended to them. The answer depends on the treatment of value environments \(\mu \) in \( get \) and \( put \). In our semantics, we assume a non-linear system, as we can use the same variable in \(\mu \) any number of times. This requires us to extend the classical lens to allow merging (\(\mathbin {\curlyvee }\)) and defaulting (\(\triangleleft \)) operations in \( put \) with \(\mathrel {\preceq }\)-well-behavedness, but makes the syntax and type system of HOBiT simple, and HOBiT free from the design issues of linear programming languages [25]. Such an extension of lenses would be applicable to some kinds of lens models, including quotient lenses and symmetric lenses, but its applicability is not clear in general. Also, we want to mention that allowing duplications in bidirectional transformation is still an open problem, as it essentially entails multiple views and the synchronization among them.
8 Conclusion
We have designed HOBiT, a higher-order bidirectional programming language in which lenses are represented as functions and lens combinators are represented as language constructs with binders. The main advantage of HOBiT is that users can program in a style similar to conventional functional programming, while still enjoying the benefits of lenses (i.e., the expressive power and well-behavedness guarantee). This has allowed us to program realistic examples with relative ease.
HOBiT for the first time introduces a truly “functional” way of constructing bidirectional programs, which opens up a new area of future explorations. Particularly, we have just started to look at programming techniques in HOBiT. Moreover, given the resemblance of HOBiT code to that in conventional languages, the application of existing programming tools becomes plausible.
Footnotes
1. The behaviour of Haskell’s \( lines \) is a bit more complicated, as it returns \([\,]\) if and only if the input is \(\texttt {""}\). This behaviour can be achieved by calling \( linesB \) only when the input list is non-empty.
Notes
Acknowledgements
We thank Shinya Katsumata, Makoto Hamana and Kazuyuki Asada for their helpful comments on the category theory and denotational semantics, from which our formal discussions originate. The work was partially supported by JSPS KAKENHI Grant Numbers 24700020, 15K15966, and 15H02681.
References
1. Bancilhon, F., Spyratos, N.: Update semantics of relational views. ACM Trans. Database Syst. 6(4), 557–575 (1981). https://doi.org/10.1145/319628.319634
2. Barbosa, D.M.J., Cretin, J., Foster, N., Greenberg, M., Pierce, B.C.: Matching lenses: alignment and view update. In: Hudak, P., Weirich, S. (eds.) ICFP, pp. 193–204. ACM (2010). https://doi.org/10.1145/1863543.1863572
3. Bohannon, A., Foster, J.N., Pierce, B.C., Pilkiewicz, A., Schmitt, A.: Boomerang: resourceful lenses for string data. In: Necula, G.C., Wadler, P. (eds.) POPL, pp. 407–419. ACM (2008). https://doi.org/10.1145/1328438.1328487
4. Bringert, B., Ranta, A.: A pattern for almost compositional functions. J. Funct. Program. 18(5–6), 567–598 (2008). https://doi.org/10.1017/S0956796808006898
5. Davies, R., Pfenning, F.: A modal analysis of staged computation. J. ACM 48(3), 555–604 (2001). https://doi.org/10.1145/382780.382785
6. Fegaras, L.: Propagating updates through XML views using lineage tracing. In: Li, F., Moro, M.M., Ghandeharizadeh, S., Haritsa, J.R., Weikum, G., Carey, M.J., Casati, F., Chang, E.Y., Manolescu, I., Mehrotra, S., Dayal, U., Tsotras, V.J. (eds.) ICDE, pp. 309–320. IEEE (2010). https://doi.org/10.1109/ICDE.2010.5447896
7. Foster, J.N., Greenwald, M.B., Moore, J.T., Pierce, B.C., Schmitt, A.: Combinators for bidirectional tree transformations: a linguistic approach to the view-update problem. ACM Trans. Program. Lang. Syst. 29(3) (2007). https://doi.org/10.1145/1232420.1232424
8. Foster, J.N., Pilkiewicz, A., Pierce, B.C.: Quotient lenses. In: Hook, J., Thiemann, P. (eds.) ICFP, pp. 383–396. ACM (2008). https://doi.org/10.1145/1411204.1411257
9. Foster, N., Matsuda, K., Voigtländer, J.: Three complementary approaches to bidirectional programming. In: Gibbons, J. (ed.) Generic and Indexed Programming. LNCS, vol. 7470, pp. 1–46. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-32202-0_1
10. Glück, R., Kawabe, M.: A program inverter for a functional language with equality and constructors. In: Ohori, A. (ed.) APLAS 2003. LNCS, vol. 2895, pp. 246–264. Springer, Heidelberg (2003). https://doi.org/10.1007/978-3-540-40018-9_17
11. Glück, R., Kawabe, M.: Revisiting an automatic program inverter for Lisp. SIGPLAN Not. 40(5), 8–17 (2005). https://doi.org/10.1145/1071221.1071222
12. Hegner, S.J.: Foundations of canonical update support for closed database views. In: Abiteboul, S., Kanellakis, P.C. (eds.) ICDT 1990. LNCS, vol. 470, pp. 422–436. Springer, Heidelberg (1990). https://doi.org/10.1007/3-540-53507-1_93
13. Hidaka, S., Hu, Z., Inaba, K., Kato, H., Matsuda, K., Nakano, K.: Bidirectionalizing graph transformations. In: Hudak, P., Weirich, S. (eds.) ICFP, pp. 205–216. ACM (2010). https://doi.org/10.1145/1863543.1863573
14. Hofmann, M., Pierce, B.C., Wagner, D.: Symmetric lenses. In: Ball, T., Sagiv, M. (eds.) POPL, pp. 371–384. ACM (2011). https://doi.org/10.1145/1926385.1926428
15. Hofmann, M., Pierce, B.C., Wagner, D.: Edit lenses. In: Field, J., Hicks, M. (eds.) POPL, pp. 495–508. ACM (2012). https://doi.org/10.1145/2103656.2103715
16. Hu, Z., Ko, H.-S.: Principles and practice of bidirectional programming in BiGUL. Oxford Summer School on Bidirectional Transformations (2017). https://bitbucket.org/prl_tokyo/bigul/raw/master/SSBX16/tutorial.pdf. Accessed 18 Oct 2017
17. Hu, Z., Mu, S.-C., Takeichi, M.: A programmable editor for developing structured documents based on bidirectional transformations. In: Heintze, N., Sestoft, P. (eds.) PEPM, pp. 178–189. ACM (2004). https://doi.org/10.1145/1014007.1014025
18. Jung, A., Tiuryn, J.: A new characterization of lambda definability. In: Bezem, M., Groote, J.F. (eds.) TLCA 1993. LNCS, vol. 664, pp. 245–257. Springer, Heidelberg (1993). https://doi.org/10.1007/BFb0037110
19. Matsuda, K., Hu, Z., Nakano, K., Hamana, M., Takeichi, M.: Bidirectionalization transformation based on automatic derivation of view complement functions. In: Hinze, R., Ramsey, N. (eds.) ICFP, pp. 47–58. ACM (2007). https://doi.org/10.1145/1291151.1291162
20. Matsuda, K., Mu, S.-C., Hu, Z., Takeichi, M.: A grammar-based approach to invertible programs. In: Gordon, A.D. (ed.) ESOP 2010. LNCS, vol. 6012, pp. 448–467. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-11957-6_24
21. Matsuda, K., Wang, M.: Applicative bidirectional programming: mixing lenses and semantic bidirectionalization. J. Funct. Program. Accepted 14 Feb 2018
22. Matsuda, K., Wang, M.: “Bidirectionalization for free” for monomorphic transformations. Sci. Comput. Program. 111(1), 79–109 (2014). https://doi.org/10.1016/j.scico.2014.07.008
23. Matsuda, K., Wang, M.: Applicative bidirectional programming with lenses. In: Fisher, K., Reppy, J.H. (eds.) ICFP, pp. 62–74. ACM (2015). https://doi.org/10.1145/2784731.2784750
24. Moggi, E.: Functor categories and two-level languages. In: Nivat, M. (ed.) FoSSaCS 1998. LNCS, vol. 1378, pp. 211–225. Springer, Heidelberg (1998). https://doi.org/10.1007/BFb0053552
25. Morris, J.G.: The best of both worlds: linear functional programming without compromise. In: Garrigue, J., Keller, G., Sumii, E. (eds.) ICFP, pp. 448–461. ACM (2016). https://doi.org/10.1145/2951913.2951925
26. Pombrio, J., Krishnamurthi, S.: Resugaring: lifting evaluation sequences through syntactic sugar. In: O’Boyle, M.F.P., Pingali, K. (eds.) PLDI, pp. 361–371. ACM (2014). https://doi.org/10.1145/2594291.2594319
27. Rajkumar, R., Foster, N., Lindley, S., Cheney, J.: Lenses for web data. ECEASST 57 (2013). https://doi.org/10.14279/tuj.eceasst.57.879
28. Stevens, P.: A landscape of bidirectional model transformations. In: Lämmel, R., Visser, J., Saraiva, J. (eds.) GTTSE 2007. LNCS, vol. 5235, pp. 408–424. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-88643-3_10
29. Voigtländer, J.: Bidirectionalization for free! (pearl). In: Shao, Z., Pierce, B.C. (eds.) POPL, pp. 165–176. ACM (2009). https://doi.org/10.1145/1480881.1480904
30. Voigtländer, J., Hu, Z., Matsuda, K., Wang, M.: Combining syntactic and semantic bidirectionalization. In: Hudak, P., Weirich, S. (eds.) ICFP, pp. 181–192. ACM (2010). https://doi.org/10.1145/1863543.1863571
31. Voigtländer, J., Hu, Z., Matsuda, K., Wang, M.: Enhancing semantic bidirectionalization via shape bidirectionalizer plug-ins. J. Funct. Program. 23(5), 515–551 (2013). https://doi.org/10.1017/S0956796813000130
32. Wang, M., Gibbons, J., Matsuda, K., Hu, Z.: Refactoring pattern matching. Sci. Comput. Program. 78(11), 2216–2242 (2013). https://doi.org/10.1016/j.scico.2012.07.014
33. Xiong, Y., Liu, D., Hu, Z., Zhao, H., Takeichi, M., Mei, H.: Towards automatic model synchronization from model transformations. In: Stirewalt, R.E.K., Egyed, A., Fischer, B. (eds.) ASE, pp. 164–173. ACM (2007). https://doi.org/10.1145/1321631.1321657
34. Yokoyama, T., Axelsen, H.B., Glück, R.: Principles of a reversible programming language. In: Ramírez, A., Bilardi, G., Gschwind, M. (eds.) CF, pp. 43–54. ACM (2008). https://doi.org/10.1145/1366230.1366239
35. Yu, Y., Lin, Y., Hu, Z., Hidaka, S., Kato, H., Montrieux, L.: Maintaining invariant traceability through bidirectional transformations. In: Glinz, M., Murphy, G.C., Pezzè, M. (eds.) ICSE, pp. 540–550. IEEE (2012). https://doi.org/10.1109/ICSE.2012.6227162
Copyright information
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made. The images or other third party material in this book are included in the book's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the book's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.