Semantic Equivalence of Task-Oriented Programs in TopHat

Task-oriented programming (TOP) is a new programming paradigm designed for describing multi-user workflows. It is implemented in the iTasks framework, in the functional programming language Clean. To reason formally about iTasks programs, a formal language called top (TopHat) has been defined, together with its operational semantics. For proving properties about task-oriented programs, it is desirable to have a definition for the semantic equivalence of two top-programs. This thesis aims to provide such a definition. We show that a task can be in one of five states after normalisation, and for every two tasks in the same state, we define what it means for them to be semantically equivalent. Using this definition, we state a number of properties we believe hold for top-programs. Amongst those, we show that the Task operation on types in top cannot be a monad.


Introduction
Task-oriented programming (TOP) is a new programming paradigm designed for developing distributed interactive multi-user systems. In this programming paradigm, the concept of a "task" plays a central role. A task is a unit of work assigned to some user, and consists of two parts: a description of the work that should be done, and a typed interface that defines the type of the task value that it returns. Tasks are described in an abstract, declarative manner, and from this abstract description, TOP automatically generates a GUI. It also takes care of the client-server communication that is needed for users to work together on tasks. TOP allows programmers to define workflows which describe what tasks should be executed by its users, without having to worry about how this is achieved.
User collaboration is a central concept in TOP. The different ways in which users can collaborate are captured by task combinators. By using these combinators, TOP-programmers can construct larger tasks from smaller ones in several ways. There is sequential composition, which allows tasks to be executed one after the other. And there is parallel composition, which allows tasks to be executed in parallel at the same time. For parallel composition, it is possible to either combine the results, or to conditionally continue with either one of two tasks.
In order to collaborate, users also need to be able to communicate with each other, and with the system. Using task combinators, it is already possible to pass along data from one task onto the next. For communication with the outside world, there are editors. Editors provide interaction with the environment via input events. They are typed containers which remember the last value that has been sent to them, and users can communicate with the system through these editors. Furthermore, they allow users to view and edit shared data sources, which are mutable references whose changes are immediately visible to all other tasks watching them.
Another important aspect of tasks is that they are typed. The type of a task determines the type of the task values that are communicated to the environment. Not all tasks have a value, and a task does not produce just one value when it is complete. Instead, a task's value is continually updated while the work takes place, and can be observed at any point during execution. Moreover, it may be possible that a task never completes, and that its value never reaches a stable state. A task's value reflects the task's current progress. Task values can be inspected by other tasks to base decisions on, which in turn can affect what users can see or do.
Finally, the TOP language is modular: tasks are composed of smaller tasks, and can be arguments or results of functions. This allows programmers to re-use tasks, and to model their own collaboration patterns [1,6].

iTasks
TOP describes in an abstract way what work should be done by the system and its users. It does not describe how this should be done; that question is answered by the TOP language implementation. The iTasks framework is an implementation of TOP, written in the pure and lazy functional programming language Clean. It is implemented as a shallowly embedded domain-specific language, which means that it inherits features from its host language Clean. Amongst these features is a strong type system, and because Clean is a functional language, task combinators can be expressed as functions. From the high-level description of the tasks, iTasks generates a web application that is able to execute the described tasks. It takes care of generating a GUI, and of coordinating the tasks in a distributed manner by using a client-server architecture. The server side of an iTasks application runs a web service to which users on a wide range of different devices can connect, and the client side realises the front-end components. This way, programmers of iTasks-programs need not be concerned with lower-level implementation details. iTasks has shown itself effective in the past for the implementation of interactive, distributed, workflow applications [1].

Research question
Because iTasks has been designed for developing real-world applications, reasoning formally about iTasks programs is hard. The paper TopHat: A formal foundation for task-oriented programming introduces top (TopHat), a formal language plus operational semantics for reasoning about task-oriented programs. A follow-up paper, TopHat Next: even more stylish task-oriented programming, is currently being written [7]. This thesis uses these two papers as its starting point. Our research question is the following: When are two programs in top semantically equivalent?

Furthermore, if we can define such a notion of semantic equivalence for task-oriented programs, what interesting properties can we prove (or disprove)? An example of an interesting property is whether the monad laws hold for the step combinator. Showing that certain equalities hold for task-oriented programs could prove useful for the iTasks system in the future. If we know that one task-oriented program is semantically equivalent to another, then we know that we can substitute one for the other without changing the meaning of the program, which in turn could be useful for compiler optimizations.

Structure of this thesis
Chapter 2 will explain the operational semantics of top. We will mostly follow the semantics presented in TopHat Next [7], but we omit some language constructs to ease our definition of semantic equivalence in the subsequent chapter. We do however keep all language features that capture the essence of TOP. In Chapter 3, we will give a definition for semantic equivalence in top, along with its motivation. We show that a task can be in either one of five states after normalisation, and for every two tasks in the same state, we define what it means for them to be semantically equivalent. Additionally, we present some properties of top-programs that we claim are true or false according to our definition. Finally, Chapter 4 will conclude this thesis.

Chapter 2
TopHat
top (TopHat) is a formal language for reasoning about task-oriented programs. It is described by a layered operational semantics, consisting of multiple big-step semantic functions for reducing expressions, and two labelled transition systems for handling user inputs. Its main layers are evaluation, normalisation, and interaction. To make clear which features come from TOP and which features come from functional programming, top is separated into a task language and an underlying host language [6,7]. The host language will be described in Section 2.1. We will give its syntax, typing rules, and evaluation semantics. The task language, which is embedded into the host language, will be presented in Section 2.2. We will explain the task constructs, give their typing rules, and give the normalisation and interaction semantics. Finally, Section 2.3 will discuss some larger examples.

Syntax
The host language of top is a simply typed λ-calculus, extended with some basic types. The grammar given in Figure 2.1 defines the syntax of top. Expressions e can be lambda expressions, variables, locations, branching, unit values, tuples, and constants for booleans, integers and strings. Booleans can be either True or False, integers use their decimal notation, and strings are enclosed by double quotation marks. There are a number of operations possible on expressions: comparison operators (<, ≤, ≡, ≢, ≥, >), logical operators for boolean expressions (¬, ∧, ∨), and numerical operations for integers (+, −, ×, /). Locations l are used for references and shared editors, which will be defined later in Section 2.2.11. They are not meant to be used by the programmer directly. Additionally, expressions can be pretasks. Pretasks define the constructs of the task language, which we will explain in Section 2.2. For now, however, we will only focus on the host language, and postpone everything related to the task language until the next section.
Besides the expressions defined in the grammar, we will use the notation e₁; e₂ as an abbreviation for (λx : Unit. e₂) e₁, where x is a fresh variable, and we will use the notation let x : τ = e₁ in e₂ as an abbreviation for (λx : τ. e₂) e₁. This is allowed because our evaluation semantics, which we will present in Section 2.1.3, is strict.

Typing
Besides function types, the simply typed λ-calculus of the host language is extended with pairs, unit types, references, task types, and primitive types for booleans, integers, and strings. Additionally, there are basic types β, which contain only a subset of all types τ . Figure 2.2 shows the type grammar of top. Reference types are used for locations. We say that a location l is of type Ref β if it points to an expression of basic type β. To prevent recursive reference types, and to keep the language total, locations can only point to expressions of basic type. Task types Task τ are used for tasks. We will postpone the typing rules for tasks until Section 2.2.2.
Typing rules in top are of the form Γ, Σ ⊢ e : τ, which should be read as "in environment Γ and store typing Σ, the expression e has type τ". The environment Γ is a mapping from variables to types, and is used in the rule T-Var to check the type of a variable, and updated in the rule T-Abs when using abstraction. The store typing Σ is a mapping from locations to types, and is used in the rule T-Loc to check the type of the expression that a location points to.
Types τ :
  Basic types      β ::= Unit | β₁ × β₂ | π       (unit, product, primitive)
  Primitive types  π ::= Bool | Int | String     (boolean, integer, string)
To say that an expression e is of type τ, we will often omit the environment Γ and store typing Σ, and simply write e : τ.

Semantics
Evaluating terms in the host language to values is handled by the evaluation semantics, where a value is an expression in the host language that cannot be reduced further. Values v can be lambda functions, pairs of values, unit, constants, locations, or tasks. Basic values b are a subset of values v that are of basic type β. The grammar for values is given in Figure 2. The host language evaluates expressions using a big-step semantics. We denote the evaluation of expression e to the value v by e ↓ v. Figure 2.5 gives the evaluation rules for expressions in the host language. The evaluation rules for the unary and binary operators are trivial, and so we omit them. We will postpone the evaluation of pretasks to tasks until Section 2.2.3.
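To make the big-step reading of e ↓ v concrete, the following Python sketch evaluates a tiny fragment of the host language: constants, variables, λ-abstraction, application, and branching. The tuple encoding of expressions and the names `Closure`, `lookup` and `evaluate` are assumptions made for illustration; they are not taken from Figure 2.5. The sketch is strict: the argument of an application is evaluated before the function body.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Closure:
    """A lambda value together with the environment it was created in."""
    param: str
    body: tuple
    env: tuple  # tuple of (name, value) pairs, innermost binding first


def lookup(env, name):
    """Find the innermost binding of `name` in the environment."""
    for key, value in env:
        if key == name:
            return value
    raise KeyError(name)


def evaluate(expr, env=()):
    """Big-step evaluation e ↓ v for a toy fragment (hypothetical encoding)."""
    tag = expr[0]
    if tag == "const":        # booleans, integers, strings
        return expr[1]
    if tag == "var":          # ("var", x)
        return lookup(env, expr[1])
    if tag == "lam":          # ("lam", x, body)
        return Closure(expr[1], expr[2], env)
    if tag == "app":          # strict: evaluate function and argument first
        fun = evaluate(expr[1], env)
        arg = evaluate(expr[2], env)
        return evaluate(fun.body, ((fun.param, arg),) + fun.env)
    if tag == "if":           # ("if", condition, then-branch, else-branch)
        branch = expr[2] if evaluate(expr[1], env) else expr[3]
        return evaluate(branch, env)
    raise ValueError(f"unknown expression: {tag}")


# let x : Int = 3 in (if True then x else 0), desugared to (λx. ...) 3
program = ("app",
           ("lam", "x", ("if", ("const", True), ("var", "x"), ("const", 0))),
           ("const", 3))
result = evaluate(program)
```

The desugared `let` at the end follows the abbreviation introduced for the host language: binding a name is just applying a λ-abstraction to the bound expression.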

Syntax
The task language of top is embedded into the host language described in the previous section. As Figure 2.1 shows, expressions in top can also be pretasks. A pretask p is a task that contains unevaluated sub-expressions. Pretasks can either be atomic tasks, or task combinators that compose larger pretasks from smaller ones. Figure 2.6 shows the grammar of pretasks.
Atomic tasks are atomic units of work that do not contain subtasks. These include internal values, failing, editors, reference creation and assignment. Internal values (■ e) simply return the expression e as result. A failing task (↯) stands for an impossible task. We describe internal value and failing in Section 2.2.7. Editors (⊠ n β, □ n e, ⊞ n e) provide communication with the environment. They are labeled by a label n, and require their input to be tagged by the same label. To allocate a label n within an expression e, we write νn.e. For label allocation, the programmer is only allowed to write νn.d. The pretasks νn.e and d are not meant to be used by the programmer directly. We explain editors in Section 2.2.4, and label allocation in Section 2.2.5. Finally, reference creation (ref e) and assignment (e₁ := e₂) enable the creation and modification of shared data. We describe references in Section 2.2.11.
Task combinators describe ways in which users can collaborate. They provide a means to combine smaller tasks into larger ones. There are several ways to do this. There is sequential composition, where the result of one task can be used in the next. This is captured by the step combinator (▶), which we will explain in Section 2.2.8. And there is parallel composition, which comes in two flavors. There is pairing (▶◀), or and-parallel, which combines the results of two tasks into one task. And there is choosing (♦), or or-parallel, which chooses the result of the leftmost task that has a value. The pair and choice combinators are described in Section 2.2.10. Finally, there is the transform combinator (▲), which applies a function to the result of a task, resulting in a new task. We describe the transform combinator in Section 2.2.9.

Typing
The typing rules for pretasks are presented in Figure 2.7. A task t is of type Task τ if it should produce a result of type τ. Label allocation can be of any Task-type (T-Label). Editors are typed containers of type Task β for some basic type β, and only accept inputs of the same type β (T-Enter, T-Update, T-Change). Only allowing editors to edit basic values prevents higher-order tasks like νn. ⊠ n (Int → Int), νn. ⊠ n (νn. ⊠ n Int), or νn. □ n (λx : τ. x) from being defined, which have no meaning in top. Failing tasks can be of any Task-type (T-Fail). Pairing combines the results of the two tasks into a tuple (T-Pair), and choosing requires that both operands are of the same Task-type (T-Choose).
Step requires that the continuation e₂ on the right-hand side is a function that takes the result from the task on the left-hand side and produces a new task (T-Step). Similarly, transform requires that the function e₁ on the left-hand side takes the result from the task on the right-hand side (T-Trans). Reference creation produces a task of type Task (Ref β) that contains the newly created location l of type Ref β (T-Share). Lastly, reference assignment returns a task containing the unit value as result, and thus is of type Task Unit (T-Assign).

Semantics
top uses a layered semantics to separate the semantics of the host language from the task language. So far, we have only seen the bottom layer of this semantics, namely evaluation (↓). Evaluation is responsible for evaluating expressions in the host language to values. Before we explain the layers on top of evaluation, we still need to describe how to evaluate pretasks, which are also expressions in the host language. Pretasks evaluate to task values t. Figure 2.8 shows the task grammar. Whereas pretasks can have unevaluated sub-expressions, tasks can only contain subtasks. Exceptions to this are transform (▲) and step (▶), where evaluation of the left-hand side and the right-hand side, respectively, is delayed. Evaluation of pretasks to tasks is defined in Figure 2.9. Most task constructs simply evaluate their operands to values. Figure 2.10 shows a graphical representation of the semantic layers and their relation. After evaluation is done, a task is ready to be normalised. Normalisation is a big-step semantics that is responsible for reducing tasks until they are ready to accept input. We write t, σ ⇓ t′, σ′, δ′ to denote the normalisation of task t in state σ to task t′ in state σ′. The state σ is a mapping from locations to basic values. It keeps track of all references created so far, and what value they currently hold. We will give the normalisation rules for each task construct in their respective subsections.

Figure 2.9: Evaluation rules for pretasks
Normalisation also returns a set δ′, which contains all locations whose values have been changed while normalisation took place. It will be used in the fixing semantics, which is defined on top of normalisation. Due to mutable references, the fixing semantics is required to make sure a task is fully normalised before any user interaction is allowed. We write t, σ, δ ⇒ t′, σ′ to denote the fixing of task t in state σ given the set δ, resulting in task t′ and state σ′. The fixing semantics will be explained in Section 2.2.11 on references.
For user interaction there are the handling and interaction semantics. Both are small-step semantics that take an input event i for each step. For the handling semantics, we write t, σ −i→ t′, σ′, δ′ to denote that handling the input i in task t and state σ results in the task t′ and state σ′. Similar to the normalisation semantics, the handling semantics also returns a set δ′, which contains all locations whose values have been changed while handling input. For the interaction semantics, we write t, σ =i⇒ t′, σ′ to denote that task t in state σ transitions to task t′ in state σ′ after the user interaction i. The interaction semantics makes use of both the fixing and handling semantics to make sure that, after user interaction, a task is fully reduced and ready to accept the next input. The handling and interaction semantics will be discussed in Section 2.2.6 on input handling.

Editors
Editors allow end users to interact with the system by entering and changing information. When a user sends an input event to an editor, the editor will update its current value to reflect the change. There are no output events. Instead, the current value of an editor can be observed and used in subsequent tasks. There are three types of editors in top:

• Empty editors (⊠ n β) or unvalued editors are editors that currently hold no value. They can be seen as an input prompt to the user to enter data. Empty editors are annotated with a basic type β, which means that only basic values of type β are accepted by the editor. Once an empty editor receives a valid input event, it becomes a filled editor containing the new data.
• Filled editors (□ n b) or valued editors are editors that currently hold the basic value b. Filled editors can be seen as either outputting a value, or as an input prompt that comes with a default value. They can never be cleared, only updated with new values of the same type.
• Shared editors (⊞ n l) watch references. They allow the user to view and change shared values. Whenever a shared editor is updated, all shared editors watching the same reference will be updated as well.
Figure 2.11: Editor state diagram (unvalued: Enter; valued: Update; shared: Change)
The relation between the different editors is illustrated in the state diagram in Figure 2.11. Since editors are already fully reduced, no normalisation needs to be done. Figure 2.12 gives the normalisation rules for editors.
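The editor life cycle described above can be sketched as a small state machine in Python. The class names `Enter`, `Update` and `Change` follow the editor names used in the typing rules; the function `handle`, its return shape, and the use of Python types as a stand-in for basic types β are assumptions made for illustration and do not reproduce the formal rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Enter:            # empty editor ⊠ n β
    label: str
    type_: type


@dataclass(frozen=True)
class Update:           # filled editor □ n b
    label: str
    value: object


@dataclass(frozen=True)
class Change:           # shared editor ⊞ n l
    label: str
    loc: str


def handle(editor, label, value, store):
    """Send `value` to the editor labeled `label`.

    Returns (editor', store', changed locations); raises ValueError when
    the event is not valid for this editor (hypothetical error handling).
    """
    if editor.label != label:
        raise ValueError("label mismatch")
    if isinstance(editor, Enter):
        if type(value) is not editor.type_:
            raise ValueError("ill-typed input")
        return Update(label, value), store, set()      # empty becomes filled
    if isinstance(editor, Update):
        if type(value) is not type(editor.value):
            raise ValueError("ill-typed input")
        return Update(label, value), store, set()      # value is replaced
    if isinstance(editor, Change):
        if type(value) is not type(store[editor.loc]):
            raise ValueError("ill-typed input")
        # the editor itself is unchanged; the shared store is updated
        return editor, {**store, editor.loc: value}, {editor.loc}
    raise TypeError("not an editor")
```

Note that a filled editor never returns to the empty state, and that only the shared editor reports a changed location, which matters later when references are discussed.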

Label allocation
Editors are labeled, and require their input to be tagged by the same label. Label allocation (νn.e) ensures that the label n is bound within the expression e. Hence, it may be renamed within e, so long as there are no name conflicts. This is useful for introducing new labels within a task-program, without having to pre-assign all labels beforehand. This idea of label allocation closely follows the idea of channel allocation in π-calculus [2]. For normalisation, ν-expressions simply normalise their body. See the normalisation rule in Figure 2.13.

Figure 2.13: Normalisation rule for label allocation
Before sending input to a task-program, it should be in ν-standard form. A task t is in ν-standard form if it has the shape t = νn₁. ⋯ νnₖ. t′, where t′ does not contain any free ν-expressions. We say a ν-expression is free iff it does not occur within a λ-expression. In this case, we call t′ the body of t.
We will also define the task observation N : Tasks → Booleans which, given a task t, returns true iff t is in ν-standard form. To bring a task-program in ν-standard form, we introduce ν-congruence. Two expressions e₁ and e₂ are said to be ν-congruent (denoted by e₁ ≡ e₂) if they are identical up to label names and the scope of ν-expressions. The rules from which ν-congruence of two expressions can be derived are given in Figure 2.14. The rule C-Alpha introduces α-conversion for ν-expressions. Similar to α-conversion in λ-calculus, this means that bound labels within e may be renamed, so long as there are no name conflicts. The rule C-Reorder says that the order of ν-expressions does not matter, and the rule C-Deallocate says that labels may be de-allocated once they are no longer in use. All other rules are scope extrusion rules, which extend the scope of a label n. This is allowed so long as n does not occur in the rest of the expression that is now included within the scope. We write n ∉ e to say that the expression e does not contain the label n. Because νn.e must be of type Task, we only apply scope extrusion to expressions whose sub-expressions can also be of type Task. This means that any ν's occurring within the right-hand side of step, or within a lambda function, are not extruded until after the step is taken, or after the function is evaluated. This is fine, because such editors are not yet reachable by the user, and cannot receive input yet.
In addition to the congruence rules defined in Figure 2.14, we also need a set of rules that say that congruence can be applied recursively to sub-expressions. For example, for label allocation we need a rule that says that if e ≡ e′, then νn.e ≡ νn.e′; and for pairing we need a rule that says that if e₁ ≡ e₁′ and e₂ ≡ e₂′, then e₁ ▶◀ e₂ ≡ e₁′ ▶◀ e₂′. These rules can easily be derived from the grammar, and so we will not give them here. We will simply say that, because ≡ is a congruence relation, it can be applied structurally to sub-expressions.

Input handling
Once a program is in ν-standard form, it is possible to send input to it. The input event E n b indicates that the input b should be entered into the editor with label n. To do this, it is useful to know what inputs a given task accepts. The observation function I returns the set of input events that are currently possible for a given task. Its definition is given in Figure 2.15. For the three editors, I returns all input events E n b where b is of the correct type. For label allocation and the task combinators, I is defined recursively. For all other tasks, I returns the empty set. We consider an input event i a valid input event for the task t iff i ∈ I(t).

Figure 2.15: Inputs observation on tasks
Finally, we are able to define how input events should be handled by tasks. This is done by the handling semantics (−i→), whose rules are given in Figure 2.16. Most handling rules simply pass along the input events to their subtasks. The only interesting rules are the handling rules for editors. A valid input event to an empty editor results in a filled editor containing the new data (H-Enter), a valid input event to a filled editor updates its value (H-Update), and a valid input event to a shared editor updates the state σ such that the reference l now contains the new value (H-Change). The rule H-Change also returns δ = {l}, which will later be used in the fixing semantics described in Section 2.2.11.
The interaction semantics uses the handling semantics to first handle the input i, after which it uses the fixing rules to make the task ready to accept the next input. The interaction semantics are given in Figure 2.17.
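As an illustration of the inputs observation I and of the validity check i ∈ I(t), here is a minimal Python sketch. Since I(t) contains every well-typed value for each reachable editor, the sketch represents it finitely as a set of (label, type) pairs. The names `inputs` and `is_valid`, the class encoding, and the choice to take the union for pairing are assumptions; the actual definition is the one in Figure 2.15.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Enter:          # empty editor ⊠ n β
    label: str
    type_: type


@dataclass(frozen=True)
class Update:         # filled editor □ n b
    label: str
    value: object


@dataclass(frozen=True)
class Pair:           # t1 ▶◀ t2
    left: object
    right: object


@dataclass(frozen=True)
class Step:           # t ▶ e
    task: object
    cont: Callable


def inputs(task):
    """Finite stand-in for I(t): the (label, type) pairs a task accepts."""
    if isinstance(task, Enter):
        return {(task.label, task.type_)}
    if isinstance(task, Update):
        return {(task.label, type(task.value))}
    if isinstance(task, Pair):
        return inputs(task.left) | inputs(task.right)
    if isinstance(task, Step):
        # editors inside the continuation are not reachable yet
        return inputs(task.task)
    return set()      # all other tasks accept no input


def is_valid(task, label, value):
    """An event is valid for a task iff it is among the task's inputs."""
    return (label, type(value)) in inputs(task)


task = Pair(Enter("a", int),
            Step(Update("b", "tea"), lambda v: Enter("c", bool)))
```

In this example the editor labeled `c` only exists inside the continuation of the step, so events for `c` are rejected until the step has been taken.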

Failing and internal value
A failing task (↯) stands for an impossible task. A task that is failing never has a value and never accepts input. Not only ↯ itself can fail; tasks with failing subtasks can fail as well. For example, the pairing (↯ ▶◀ ↯) is also a failing task, and is equivalent to ↯. To capture which tasks are failing, we introduce the failing observation F, which is defined in Figure 2.18. Failing is especially useful when used in combination with the step combinator, which will be explained in the next section.

Figure 2.18: Failing observation on tasks
Internal value (■ v) can be used to output the value v as result. Unlike editors, it accepts no input, and thus the value of ■ v will always remain the same. Figure 2.19 gives the normalisation rules for fail and internal value.
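The failing observation F can be sketched as a recursive predicate over the task structure. Only the case ↯ ▶◀ ↯ is stated explicitly in the text; the other clauses below (conjunction for pairing and choice, and looking only at the left operand of a step) are assumptions that are consistent with that example, and the class names are hypothetical. Figure 2.18 remains the authoritative definition.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Fail:           # ↯
    pass


@dataclass(frozen=True)
class Value:          # ■ v
    value: object


@dataclass(frozen=True)
class Pair:           # t1 ▶◀ t2
    left: object
    right: object


@dataclass(frozen=True)
class Choose:         # t1 ♦ t2
    left: object
    right: object


@dataclass(frozen=True)
class Step:           # t ▶ e
    task: object
    cont: Callable


def failing(task):
    """Sketch of F(t): True iff the task can never have a value or accept input."""
    if isinstance(task, Fail):
        return True
    if isinstance(task, (Pair, Choose)):
        # assumed: a parallel task fails only if both branches fail
        return failing(task.left) and failing(task.right)
    if isinstance(task, Step):
        # assumed: the continuation is not inspected before the step is taken
        return failing(task.task)
    return False      # internal values (and editors) are never failing
```

With these clauses, a pairing fails only when neither branch can make progress, so a pairing with one failing and one live branch still counts as live.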

Step
The step combinator (▶) allows the result from one task to determine the next task. We call this sequential composition. The step combinator expects a task t of type Task τ₁ on the left-hand side, and a continuation e on the right-hand side, which is a function from τ₁ to a successor task of type Task τ₂. Before we can define the normalisation rules of step, we need to have a way to determine the value of a task. For this, we introduce the task observation V. Given a task t : Task τ and its current state σ, this function returns the task's value v of type τ. It is also possible that a task's value is undefined, in which case we write V(t, σ) = ⊥. The definition of V is given in Figure 2.20.
Steps are guarded. A step can only be taken if two conditions are met: (1) the task on the left-hand side has a value, and (2) the evaluation of the continuation on the right-hand side with this value does not fail. The normalisation rules for step are given in Figure 2.21. There are three cases: either the first condition fails, and the step remains guarded (N-StepNone); or the first condition is met, but the second condition fails, and the step remains guarded (N-StepFail); or both conditions are met, and the step can proceed (N-StepCont). Note that the last two rules use evaluation in the host language to compute the successor task. The result of this evaluation is only used when the step can be taken successfully (N-StepCont), and discarded otherwise (N-StepFail).

Figure 2.20: Value observation on tasks

Example 2.2 (Coffee machine).
Consider the following task-program: This program describes a coffee machine that can either serve coffee or tea.
Coffee is served when one coin is inserted, and tea is served when two coins are inserted. For any other number of coins, the step remains guarded, which means that the coffee machine returns nothing and waits until a correct number of coins is inserted.
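The guarded behaviour of the coffee machine can be sketched in Python as follows. This is a hypothetical encoding based only on the prose description: the class names, the function `serve`, and the use of `None` for the undefined value ⊥ are assumptions, and labels and the state σ are left out. The three branches of `normalise` correspond to the three cases for step described above.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Enter:             # empty editor: no value yet
    label: str


@dataclass(frozen=True)
class Update:            # filled editor holding a value
    label: str
    value: object


@dataclass(frozen=True)
class Value:             # ■ v
    value: object


@dataclass(frozen=True)
class Fail:              # ↯
    pass


@dataclass(frozen=True)
class Step:              # t ▶ e
    task: object
    cont: Callable


def value(task):
    """Task value, with None standing for ⊥ (assumes None is never a value)."""
    return task.value if isinstance(task, (Update, Value)) else None


def failing(task):
    return isinstance(task, Fail) or (isinstance(task, Step) and failing(task.task))


def normalise(task):
    if not isinstance(task, Step):
        return task                       # editors and values are already normal
    left = normalise(task.task)
    v = value(left)
    if v is None:
        return Step(left, task.cont)      # no value on the left: stay guarded
    successor = task.cont(v)
    if failing(successor):
        return Step(left, task.cont)      # successor fails: discard it, stay guarded
    return normalise(successor)           # both conditions met: take the step


def serve(coins):
    """Continuation of the coffee machine: one coin for coffee, two for tea."""
    if coins == 1:
        return Value("Coffee")
    if coins == 2:
        return Value("Tea")
    return Fail()
```

For instance, `normalise(Step(Update("coins", 3), serve))` returns the same guarded step, which models the machine waiting for a correct number of coins.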

Transform
The transform combinator (▲) maps a function over a task. It takes a function of type τ₁ → τ₂ on the left-hand side, and a task of type Task τ₁ on the right-hand side, resulting in a task of type Task τ₂. The normalisation rule of the transform combinator, given in Figure 2.22, does not actually do anything. If we want to apply the function and use the result, we would need to use the transform combinator in combination with the step combinator. This would allow us to extract the value of the transform task by using the task observation V, which in case of the transform combinator is defined as V(e ▲ t, σ) = v′ when V(t, σ) = v and e v ↓ v′. So, if the task on the right-hand side has a value v, and applying the function e to v results in the value v′, then the transform combinator has the value v′.
This program describes a traffic light whose light is initially turned off, but given the right input, it can either become red or green. So long as no input is given, the transform task on the left-hand side of the step combinator has no value, and the step remains guarded. Once an input is entered, transform returns the value Green if True was entered, and Red if False was entered, upon which the step proceeds and displays the result.

Parallel
Pairing (▶◀) combines the results of two tasks, but only if both branches have a value. If the left task is of type Task τ₁, and the right task is of type Task τ₂, then their pairing is of type Task (τ₁ × τ₂). However, if one or both branches have no value, then the resulting task also has no value. Choosing (♦) chooses one of two branches. This combinator is left-biased: it returns the leftmost task that has a value. If neither task has a value, then the resulting task also has no value. The normalisation rules for pairing and choice are given in Figure 2.23. See also Figure 2.20 for the definition of V for pairing and choice. These combinators allow users to work on two tasks in parallel, but unlike the name suggests, parallel does not mean that there is non-determinism. The order of execution is determined by the order in which user inputs are sent. Instead, parallel here means that the order in which we execute the tasks, and their subtasks, does not matter.
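The value observation for the two parallel combinators can be sketched directly from this description. The Python encoding below is an assumption made for illustration: `BOTTOM` (implemented as `None`) stands for ⊥, the state σ is omitted because no shared editors are involved, and the class names are hypothetical.

```python
from dataclasses import dataclass

BOTTOM = None  # stands for ⊥; assumes None is never a legitimate task value


@dataclass(frozen=True)
class Enter:          # empty editor: has no value
    label: str


@dataclass(frozen=True)
class Update:         # filled editor: its value is the stored value
    label: str
    value: object


@dataclass(frozen=True)
class Value:          # ■ v
    value: object


@dataclass(frozen=True)
class Pair:           # t1 ▶◀ t2
    left: object
    right: object


@dataclass(frozen=True)
class Choose:         # t1 ♦ t2
    left: object
    right: object


def value(task):
    """Sketch of V for editors, internal values, pairing and choice."""
    if isinstance(task, (Update, Value)):
        return task.value
    if isinstance(task, Pair):
        left, right = value(task.left), value(task.right)
        if left is BOTTOM or right is BOTTOM:
            return BOTTOM                 # both branches must have a value
        return (left, right)
    if isinstance(task, Choose):
        left = value(task.left)
        # left-biased: the right branch is only consulted if the left has no value
        return left if left is not BOTTOM else value(task.right)
    return BOTTOM


drink = Choose(Enter("tea"), Update("coffee", "Coffee"))
```

Here `drink` has the value "Coffee" because its left branch is still an empty editor, while a pairing of `drink` with an empty editor has no value at all.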

Example 2.4 (Breakfast).
Let us consider the following task-program, which makes use of both parallel combinators:
  let make : τ → Task τ = λx. νn. ⊠ n Unit ▶ λy. ■ x in
  let makeBreakfast : Task (Drink × Food) =
    ((make Tea ♦ make Coffee) ▶◀ make Egg) ▶ eatBreakfast
This program describes a simple workflow for making breakfast. Breakfast consists of something to drink (tea or coffee), and something to eat (eggs). The drink and the food are prepared in parallel (▶◀), which means that the order in which they are made does not matter. For the drink, users have a choice (♦) whether they want tea or coffee with their breakfast. For the food, users will always make an egg. We will use the function make to simulate that the user must first perform an action (i.e. send user input) before an item is prepared and the task has a value. Only when both the drink and food are ready, can the step be taken and can we enjoy our breakfast.

References
References model shared data sources in top. They provide a way for tasks to share information across control flow, and allow multiple users to simultaneously view or edit the same data. We have already seen shared editors, which allow users to modify a shared value, upon which the result immediately becomes visible to all other tasks interested in it. To create a reference to an expression e, we write ref e. Reference creation is a task which, upon normalisation, adds the reference to the state and results in a task whose value is the newly created location l pointing to the basic value b that e evaluates to.
Reference assignment (e₁ := e₂) allows the system to assign a new value to a reference. It expects a location l of type Ref β on the left-hand side, and an expression of type β on the right-hand side. Upon normalisation, reference assignment saves the location's new value in the state and returns the unit value. Because l's value has been changed, it also returns the set δ = {l}. See Figure 2.24 for the normalisation rules of reference creation and assignment. This approach to references follows the one presented by Pierce [3], except that in our case, references are lifted into the task domain.

Figure 2.24: Normalisation rules for references
There is still one problem that needs to be solved when using references. Consider the following example:
This program reduces to the following task after normalisation: (νn. ⊞ n l ▶ λx : Bool. if x then e else ↯) ▶◀ ■⟨⟩ where σ = {l ↦ True}. However, this task is not fully normalised. This happens because the normalisation rule N-Pair for pairing (t₁ ▶◀ t₂) normalises its operands from left to right. Therefore, when the left task is normalised, the reference l is still set to False, and thus the step remains guarded. Only after normalisation of the left task is done is the right task normalised, which sets the reference l to True. We would need to normalise the task-program a second time to fix this.
To prevent this problem, we keep track of the set of references whose value has been changed while normalising or while handling input. This is the set δ that is returned after normalisation and input handling. There are two instances where this happens. Either the reference is updated by the system through reference assignment (N-Assign), or the reference is updated by the user through a shared editor (H-Change). This motivates the fixing semantics presented in Figure 2.26. This semantics makes use of another task-observation function W, which returns all references that are currently being watched by a shared editor inside a task. Its definition is given in Figure 2.25.

Figure 2.25: Watching observation on tasks
Fixing rules are of the form t, σ, δ ⇒ t′, σ′. They make sure that normalisation is applied until the task is truly reduced. When given a task, the current state, and the set δ, the fixing semantics will first apply normalisation on the given task and state. If it turns out that after normalisation, the resulting task watches a reference that has been changed meanwhile (either by the normalisation or by the initial set δ), then normalisation must be applied again. This is captured by the rule F-Loop. This process is repeated until the task is truly normalised, as determined by the rule F-Done. This rule also ensures that the resulting task t′′ is in ν-standard form, which is required before handling user input. Recall that the interaction semantics makes use of the fixing semantics. This way, the semantics of top ensures that interaction with the environment can only take place after all updates to shared data sources are fully processed.
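The fixing loop described above can be sketched in Python. This is a minimal illustration on a toy task encoding, not the rules of Figure 2.26: the tuple encoding, the names `normalise`, `watched` and `fix`, the `guard` construct (a shared editor plus a guarded step rolled into one), and the choice to reset the dirty set after each round are all assumptions made for the sketch.

```python
from typing import Dict, FrozenSet, Tuple

Task = tuple                 # toy encoding, e.g. ("pair", left, right)
State = Dict[str, bool]      # toy state: reference name -> value
Refs = FrozenSet[str]        # a set of reference names (plays the role of delta)


def normalise(task: Task, state: State) -> Tuple[Task, State, Refs]:
    """One left-to-right normalisation pass; returns the changed references."""
    kind = task[0]
    if kind == "assign":                      # ("assign", ref, value)
        _, ref, value = task
        return ("done",), {**state, ref: value}, frozenset({ref})
    if kind == "guard":                       # ("guard", ref, continuation)
        _, ref, cont = task
        if state[ref]:                        # the step can be taken
            return normalise(cont, state)
        return task, state, frozenset()       # the step remains guarded
    if kind == "pair":                        # ("pair", left, right)
        _, left, right = task
        left2, state1, d1 = normalise(left, state)
        right2, state2, d2 = normalise(right, state1)
        return ("pair", left2, right2), state2, d1 | d2
    return task, state, frozenset()           # ("done",) is already normal


def watched(task: Task) -> Refs:
    """Toy counterpart of W: references watched inside a task."""
    kind = task[0]
    if kind == "guard":
        return frozenset({task[1]})
    if kind == "pair":
        return watched(task[1]) | watched(task[2])
    return frozenset()


def fix(task: Task, state: State, dirty: Refs = frozenset(),
        max_rounds: int = 100) -> Tuple[Task, State]:
    """Normalise repeatedly until no watched reference is dirty."""
    for _ in range(max_rounds):
        task, state, changed = normalise(task, state)
        if not ((dirty | changed) & watched(task)):
            return task, state                # corresponds to F-Done
        dirty = frozenset()                   # assumption: start the next round clean
    raise RuntimeError("no fixed point reached within max_rounds")


# The problematic program from the text: a step guarded on l, paired with l := True.
program = ("pair", ("guard", "l", ("done",)), ("assign", "l", True))
print(normalise(program, {"l": False})[0])    # left step is still guarded
print(fix(program, {"l": False})[0])          # both sides are done
```

A single call to `normalise` leaves the left-hand step guarded, exactly as in the example, whereas `fix` runs a second pass because the changed reference `l` is still being watched.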

Examples
This section will present two larger example-programs in top. Example 2.5 models a stopwatch, which shows how shared editors can be used to create timed tasks. It also shows an example of a higher-order task, and it demonstrates the use of the transform, step and choice combinators. Example 2.6 models a simple multi-user flight booking system, which shows how multiple users can work in parallel on the same shared data source. It demonstrates reference creation, assignment, and the step and parallel combinators.

Example 2.5 (Stopwatch).
Shared editors can also be used to represent sensors or clocks. For example, we can represent the current time as a shared editor νn. ⊞ n time. While sensors and clocks are not explicitly modelled in top, we can assume that they exist as external users which periodically send update events to the system. By using shared editors as clocks, we could write a task-program that reacts to a timeout:
    … if now ≥ start + m then t else ↯ in
5   let stopwatch : Task String = ((λs. 1000 × s) ▲ νn. ⊠ n Int) ▶ λm.
6     (νn. ⊠ n Unit ▶ λx. ■ "Stopped") ♦ wait m (■ "Done")
The wait function (lines 1-4) is an example of a higher-order task. It takes an integer m and a task t as arguments. The first step is immediately taken, so that the variable start holds the initial time (line 2). The next step will remain guarded until m milliseconds have passed, after which the task t is returned (lines 3-4). We use this function to define a stopwatch in top. Suppose that the user should enter the number of seconds s for which the stopwatch should run. Because we defined wait on milliseconds, we use transform (▲) to convert seconds to milliseconds (line 5). After the user enters a value, the step is taken, which starts the task wait m (■ "Done") (line 6). This task displays "Done" on the screen after m milliseconds, but so long as this task is still running, it has no value. We give the user a choice (♦) to interrupt the stopwatch by sending an input event to the task on the left-hand side of the choice combinator (line 6). If the user does so before m milliseconds have passed, the left-hand side of the choice combinator will have a value before the right-hand side, and it displays "Stopped" on the screen.
In Example 2.6, passengers booking a flight should first enter their name and age into the system (line 5). Only when they enter a valid name and are at least 18 years old (line 2) are they allowed to proceed. Next, they have to choose how many tickets they want to buy. We create a shared reference freeSeats to keep track of how many seats are still available, and set its initial value to 42 (line 1). A user is only allowed to buy a certain number of tickets if it does not exceed the number of tickets available. We can get the current value of freeSeats by using a shared editor. Because we want to get this value at the same moment as the user enters the number of tickets they want to buy, we set these two editors in parallel (line 3). If all went well, the system updates the value of freeSeats, and displays the passenger and the number of tickets bought (line 4).
The parallel combinator (▶◀) allows multiple bookFlight instances to run in parallel (line 7). This way, multiple users can book tickets at the same time. Their input events can interleave, and the order of execution is determined by the order of input events.
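The effect of interleaved input events on the shared reference can be illustrated with a small Python sketch. It is not a translation of the flight booking program: the names `Booking`, `book` and `run`, the reading of "valid name" as a non-empty string, and the requirement of at least one ticket are assumptions. The sketch also simplifies by dropping a booking whose guard does not hold, whereas in top the step would simply remain guarded.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Booking:
    """One passenger's completed input (hypothetical record)."""
    name: str
    age: int
    tickets: int


def valid_passenger(name: str, age: int) -> bool:
    # Assumption: a name is valid when it is non-empty.
    return bool(name.strip()) and age >= 18


def book(free_seats: int, b: Booking) -> Tuple[int, Optional[Tuple[str, int]]]:
    """Return the new number of free seats and a confirmation, if any."""
    if not valid_passenger(b.name, b.age):
        return free_seats, None
    if b.tickets < 1 or b.tickets > free_seats:
        return free_seats, None
    return free_seats - b.tickets, (b.name, b.tickets)


def run(events: List[Booking], free_seats: int = 42):
    """Process bookings in the order in which their events arrive."""
    confirmations = []
    for b in events:
        free_seats, confirmation = book(free_seats, b)
        if confirmation is not None:
            confirmations.append(confirmation)
    return free_seats, confirmations


ann = Booking("Ann", 30, 30)
bob = Booking("Bob", 40, 20)
print(run([ann, bob]))   # Ann's 30 tickets fit; Bob's 20 no longer do
print(run([bob, ann]))   # the other order favours Bob
```

With 42 seats, the two orders of arrival give different outcomes, which is the sense in which the order of execution is determined by the order of input events.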

Semantic equivalence in TopHat
In this chapter, we will examine when two top-programs are semantically equivalent. Let us first consider what it means in general for two programs to be semantically equivalent. According to Sewell, a "good" definition of semantic equivalence should satisfy the following properties [5]: (i) Programs that result in observably different values must not be equivalent. (ii) Programs that terminate should not be equivalent to programs that do not terminate. (iii) It must be an equivalence relation. (iv) It must be a congruence. (v) It should relate as many programs as possible subject to the above properties.
It should be obvious that the first three properties are desirable. The fourth property about congruence states that if two programs e₁ and e₂ are semantically equivalent, then we should be able to use e₁ and e₂ interchangeably within any program without changing its meaning. Finally, the last property ensures that the equivalence is not just the empty relation.
We will keep these properties in mind when giving our definition of semantic equivalence. In Section 3.1, we will give a definition for the semantic equivalence of two expressions in the host language. The next section, Section 3.2, will look at the semantic equivalence of two tasks. Finally, Section 3.3 will present a set of properties that we believe hold true for top-programs with our definition. We will use the symbol ≃ for the semantic equivalence of two expressions in the host language, and a separate symbol for the semantic equivalence of two tasks.

Expression equivalence
Before we consider semantic equivalence of tasks, we will first look at semantic equivalence of expressions in the host language. We will start with an example. Consider the following two expressions: It should be obvious that these two functions are equivalent: they both return the absolute value of their argument. Therefore, we should be able to use them interchangeably within any top-program without changing its behavior. So even though the functions e₁ and e₂ are different, we will never detect a difference between them when they are being used within a top-program, because for all possible arguments, e₁ and e₂ evaluate to the same result. So, when deciding if two expressions in the host language are equivalent, it is not enough to just look at the resulting value after evaluation. We need to consider all contexts that an expression can be used in. This leads to the definition of contextual equivalence. Pitts defines contextual equivalence informally as follows [4]: Two phrases of a programming language are contextually equivalent if any occurrences of the first phrase in a complete program can be replaced by the second phrase without affecting the observable results of executing the program.
This kind of equivalence is also called operational, or observational equivalence. To formally define such a notion of contextual equivalence for a given programming language, we must answer two questions: "What is a complete program?", and: "What are the observable results?". Depending on the answers to these two questions, this can result in different definitions of semantic equivalence for the same programming language [4].
For expressions in top, we answer these two questions as follows: we will consider an expression in the host language a complete program if it does not contain any free variables, and the only observation we are interested in is the resulting value after evaluation. We also need a way to substitute an expression in a program by another. For this, we use the notion of a program context. A context C[−] is a complete program that can contain "holes", denoted by the symbol '−', which can be filled. We write C[e] for the expression that results from replacing all occurrences of − in C by e. Figure 3.1 gives the context grammar for expressions. For the definition of expression equivalence, we will only quantify over all contexts of basic type β, because we can only observe the equivalence of two basic values. If we allowed all types, then we would have the same problem as introduced at the beginning of this section. Because then the context C can also be a …
Figure 3.1: Context grammar for expressions (excerpt). Pretasks P ::= νn.C | ⊠ n β | □ n C | ⊞ n C (label allocation, editors)
Actually proving that two expressions are contextually equivalent is hard, as we would need to quantify over all contexts. That is, we would need to consider all possible ways that a program can use an expression. However, showing that two expressions e₁ and e₂ are contextually inequivalent is straightforward. All we have to do is find one context C[−] : β such that C[e₁] and C[e₂] evaluate to different values.
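The asymmetry between proving equivalence and refuting it can be made concrete with a small Python sketch. Contexts are modelled here as Python functions that take the expression filling the hole and return a basic value. The functions `abs1`, `abs2` and `ident` and the three sample contexts are hypothetical stand-ins, not the expressions of the original example.

```python
def abs1(x):
    return x if x >= 0 else -x


def abs2(x):
    return -x if x < 0 else x


def ident(x):
    return x


# Each context has a single hole and yields a value of basic type.
contexts = {
    "hole applied to 3": lambda hole: hole(3),
    "hole applied to -3": lambda hole: hole(-3),
    "hole(-1) > 0": lambda hole: hole(-1) > 0,
}


def distinguishing_context(e1, e2, contexts):
    """Return the name of the first context that tells e1 and e2 apart."""
    for name, context in contexts.items():
        if context(e1) != context(e2):
            return name
    return None


print(distinguishing_context(abs1, abs2, contexts))   # None
print(distinguishing_context(abs1, ident, contexts))  # hole applied to -3
```

A result of `None` only says that none of the sampled contexts separates the two expressions; it is not a proof of contextual equivalence, which would require quantifying over all contexts. A single named context, on the other hand, is enough to establish inequivalence.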

Task equivalence
For expression equivalence, we needed contexts to determine the equivalence of two lambda functions, whose results can only be observed after evaluation. For tasks however, we do not need contexts to view their results. A task's value can be determined at any point during execution, whereupon it either has a value, or it is undefined. On the other hand, tasks do allow user interaction, and depending on what inputs are sent, the resulting task may be different. So while a lambda function can produce a different result depending on its arguments, so can a task produce different results depending on what inputs are sent to it. So in a sense, for tasks, the "contexts" are user input.
The first property at the beginning of this chapter states that programs that result in observably different values must not be equivalent. Before we look at observations however, we should first fully normalise the tasks whose equivalence we want to determine. Normalisation keeps track of a state σ. For semantic equivalence, we do not want normalisation to result in two different states. So, given two tasks t₁, t₂ : Task τ, we want that for all states σ, normalisation of t₁ and t₂ yields tasks t′₁ and t′₂ in the same state σ′. We use the fixing semantics (⇒) to ensure that the tasks are fully normalised. After normalisation, we need to decide what observations t′₁ and t′₂ must have in common for them to be considered equivalent. So let us recall what observations can be made on tasks. There is the value function V which returns the value v of a task, or ⊥ if it is undefined; there is the failing function F which returns whether a task is failing; and there is the inputs function I which returns the set of all possible input events that a task accepts. The value and failing functions are used in the normalisation rules of top, and different observations for these functions can result in different derivation rules being triggered. Therefore, we can say that tasks for which the value or failing function return a different result must not be semantically equivalent. Similarly, tasks whose inputs function I returns a different set of input events can also not be semantically equivalent, because that would mean that the types of interaction that can be done with the tasks are different. Recall that by the fixing rules, presented in Figure 2.26, t′₁ and t′₂ should be in ν-standard form. Recall also that input events should be labeled by the same label as the editor they are meant for. So if we require that I(t′₁) = I(t′₂), then this also implies that t′₁ and t′₂ must have the same label names for all extruded ν-expressions and their editors.
If this is not possible by the congruence rules, then we can never have that I(t′₁) = I(t′₂), and the tasks cannot be semantically equivalent.

TASKS CHAPTER 3. SEMANTIC EQUIVALENCE
Given these observations, we say that at least the following property must hold: if t₁ and t₂ are semantically equivalent, then F(t′₁) = F(t′₂), V(t′₁, σ′) = V(t′₂, σ′), and I(t′₁) = I(t′₂). For any of these task observations, we can distinguish two cases: a task either fails or does not fail, it either has a value or its value is undefined, and it either accepts input or it accepts no more input. Based on these case distinctions, we will say that a task is in one of five states after normalisation. These task states, and some examples, are shown in Figure 3.2. The next subsections will describe each task state in more detail.

Figure 3.2: Task states after normalisation, classified by the observations F, V and I, with examples
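The case distinction on the three observations can be written out as a small Python function. This is only a sketch of the classification described in the following subsections: the function name `task_state` and the Boolean encoding of the observations are assumptions, and combinations excluded by Conjecture 3.4 are mapped to `None`.

```python
from itertools import product
from typing import Optional


def task_state(failing: bool, has_value: bool, accepts_input: bool) -> Optional[str]:
    """Classify a normalised task by F, V != ⊥ and I != ∅."""
    if failing:
        # By Conjecture 3.4, a failing task has no value and accepts no input.
        if has_value or accepts_input:
            return None
        return "failing"
    if has_value:
        return "finished (unstable)" if accepts_input else "finished (stable)"
    return "running" if accepts_input else "stuck"


for f, v, i in product([True, False], repeat=3):
    print(f, v, i, task_state(f, v, i))
```

Of the eight combinations, three are ruled out for failing tasks, which leaves exactly the five task states.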

Failing tasks
A failing task t is a task for which the failing function F (t) yields true. In the original top paper, theorem 6.5 states that a task fails if and only if it accepts no more user input [6]. However, with the introduction of internal value in TopHat Next [7], this is no longer the case, because ■ e does not fail, and neither does it accept user input. What we can still say however is that if a task is failing, then we know that it has no value and accepts no more user input (Conjecture 3.4).

Definition 3.3 (Failing task).
We call a task t : Task τ failing iff F(t).

Conjecture 3.4.
For all failing tasks t and states σ, we have that V(t, σ) = ⊥ and I(t) = ∅.
Once a task fails, it will always remain failing, because by Conjecture 3.4, no more user interaction is possible, and by assumption, the task is already fully normalised. Failing can thus be regarded as one type of termination, and we will consider all tasks that fail to be equivalent. Hence, we will say that the tasks ↯, ↯ ▶◀ ↯, ↯ ♦ ↯, (λx. x) ▲ ↯, ↯ ▶ λx. ■ x, and all other failing tasks, are semantically equivalent to each other.

Finished tasks
A finished task t is a task which yields a value V(t, σ) = v for v ≠ ⊥. This value can either be stable when no more user input is possible, or unstable when the task still accepts user input. An example of a finished task with a stable value is the task ■ 42. Because this task accepts no more user input, its value will always remain equal to 42. An example of a finished task with an unstable value is the task νn. □ n 42. This task still accepts user input, and thus its value can keep on changing. Even though both tasks yield the same value, they should not be equivalent, since one's value can be changed and the other one's value cannot.
Definition 3.5 (Finished task). We call a task t : Task τ in state σ finished iff V(t, σ) = v for some v ≠ ⊥. Furthermore, we call a finished task stable iff I(t) = ∅, and unstable iff I(t) ≠ ∅.
A stable task can thus never be semantically equivalent to an unstable task, even if their values are (initially) the same. But just looking at the resulting values, and whether the resulting value is stable or not, is not enough to determine semantic equivalence of finished tasks. Consider for example the following tasks: These are all finished tasks with value ⟨2, 3⟩. Tasks t₁ and t₂ are both stable, and thus equivalent to each other. We have already concluded that stable tasks cannot be equivalent to unstable tasks, so neither t₁ nor t₂ is equivalent to t₃ or t₄. But we will also say that t₃ and t₄ are not equivalent, because we have that t₄ ≡ νn. νm. □ n 2 ▶◀ □ m 3 by ν-congruence, so I(t₃) ≠ I(t₄). This means that the types of interaction that can be done with t₃ differ from the types of interaction that can be done with t₄. In the case of t₃, the user can only alter the pair in one go, whereas for t₄, the user can partially update it. So for finished tasks, we also need to require that I(t) = I(t′) if we want to conclude that t and t′ are equivalent.
And yet this is still not enough. Consider the following two tasks: Task t₅ ensures that its output is always positive. So if a task still accepts user input, not only do we need to look at a task's current value, but we also need to consider all values that a task can have after user interaction. We claim that a finished task will always remain finished with the same input space, only its value may change (Conjecture 3.6).
Conjecture 3.6. If t is a finished task, then for all inputs i ∈ I(t): if handling i in state σ takes t to a task t′, then t′ is again a finished task. Moreover, if we do not allow α-conversion for ν-expressions, then we also have that I(t′) = I(t).

Stuck tasks
A stuck task t is a task which does not fail, does not have a value, and does not accept user input. Such tasks are essentially "broken". Examples of stuck tasks are ■ 2 ▶ λx. ↯, ■ 2 ▶◀ ↯, and ↯ ▶◀ ■ 2. The first example is stuck because the right-hand side always fails, and thus the step can never be taken. For pairing, we have that both sides must fail before ▶◀ fails, and both sides must have a value before ▶◀ has a value. So, if one side fails and the other side has a value, then neither observation is true, and the task is stuck.
Definition 3.7 (Stuck task). We call a task t : Task τ in state σ stuck iff ¬F(t), V(t, σ) = ⊥, and I(t) = ∅.
A stuck task will always remain stuck, because no more user interaction is possible, and by assumption it is already fully normalised. Similar to failing, we will consider stuck tasks as another type of termination, and we will say that all stuck tasks are semantically equivalent to each other.

Running tasks
A running task t is a task which does not fail, does not have a value, but still accepts user input. Because there is still user interaction possible, it may be the case that with the right input, it transitions to one of the previously described task states. The simplest example of a running task is the empty editor νn. ⊠ n β, which becomes a finished (unstable) task once it receives a valid input event. There also exist running tasks that only transition to another task state for some inputs, or for no inputs at all. For example, the tasks νn. ⊠ n Int ▶ λx. ↯, νn. ⊠ n ▶◀ ↯, and ↯ ▶◀ νn. □ n 2 are all running tasks which will forever remain running; and the task νn. ⊠ n Int ▶ λx. if x ≤ 2 then ■ x else ↯ will only transition to another task state for some inputs, but not for others.
Definition 3.8 (Running task). We call a task t : Task τ in state σ running iff ¬F(t), V(t, σ) = ⊥, and I(t) ≠ ∅.

To determine the equivalence of two running tasks, we therefore need to look at all possible user interactions, and check that they affect the two tasks in the same way. To do this, we need to have a way to talk about sequences of input events, instead of just single input events. We give the following definition for this:
Definition 3.9 (Input sequences). An input sequence I = i₀, ..., iₙ is a finite sequence of input events. Given a task t₀ : Task τ and a state σ₀, we say that I is a valid input sequence for t₀, σ₀ iff: $t_0, \sigma_0 \overset{i_0}{\Longrightarrow} t_1, \sigma_1 \overset{i_1}{\Longrightarrow} \cdots \overset{i_n}{\Longrightarrow} t_{n+1}, \sigma_{n+1}$ with i_j ∈ I(t_j), for all j ∈ {0, ..., n}. We will use the shorthand notation $t_0, \sigma_0 \overset{I}{\Longrightarrow}{}^{*} t_{n+1}, \sigma_{n+1}$ to denote the above derivation. We also consider the empty input sequence, denoted by Λ, to be a valid input sequence, and for all tasks t and states σ we have that: $t, \sigma \overset{\Lambda}{\Longrightarrow}{}^{*} t, \sigma$.
We will make a distinction between running tasks that forever remain running, no matter what inputs you send to them; and running tasks for which there exists at least one input sequence which "escapes", i.e. which transitions to another task state. We will call the former class looping tasks, and the latter class branching tasks, formally:
Definition 3.10 (Looping and branching). We call a running task t : Task τ in state σ looping iff for all valid input sequences I with $t, \sigma \overset{I}{\Longrightarrow}{}^{*} t', \sigma'$, t′ is again a running task. If there exists at least one valid input sequence for which t′ is not a running task, then we call t branching.
For branching tasks, it is possible to transition to either a finished or a stuck task state. Take for example the task νn. ⊠ n Int ▶ λx. t, which is a running task that transitions to t after a valid input event. So long as t is not failing, the step can be taken, and so t can be any non-failing task. It is also possible that for some input events, it transitions to one task state, and for others, that it transitions to a different task state. For example the task νn. ⊠ n Int ▶ λx. if x ≤ 2 then t else t′ transitions to t for some inputs, and to t′ for other inputs. We will claim that a running task can never transition to a failing task (Conjecture 3.11).
Conjecture 3.11. If t : Task τ is a running task, then for all states σ and for all valid input sequences I, if $t, \sigma \overset{I}{\Longrightarrow}{}^{*} t', \sigma'$, then t′ is not a failing task.
We say that two running tasks t₁ and t₂ in state σ are semantically equivalent if for all valid input sequences I: $t_1, \sigma \overset{I}{\Longrightarrow}{}^{*} t'_1, \sigma'$ and $t_2, \sigma \overset{I}{\Longrightarrow}{}^{*} t'_2, \sigma'$, with V(t′₁, σ′) = V(t′₂, σ′) and I(t′₁) = I(t′₂). That is, for all possible user interactions, t₁ and t₂ are not observably different. Because of Conjecture 3.11, we do not need to check whether the tasks fail or not. Figure 3.3 shows all possible task states and their transitions. In this diagram, a looping task is a task which never leaves the running state, i.e. which always takes the transition back to the running state, no matter what input is given. A branching task is a running task for which there exists at least one input sequence which will transition to either stuck, finished (stable), or finished (unstable). We claim that this state diagram is correct (Conjecture 3.12).
Conjecture 3.12. Figure 3.3 is correct. That is, for any two task states S and S′, there is a transition from S to S′ iff there exist two tasks t ∈ S and t′ ∈ S′ such that $t, \sigma \overset{i}{\Longrightarrow} t', \sigma'$ for some input i ∈ I(t) and states σ and σ′.
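Comparing two tasks over all input sequences can be approximated by a bounded search, sketched below in Python. The sketch is a toy under several assumptions: tasks are tuples `(kind, value)`, there are no shared references (so the state σ is ignored), the input space is the finite sample `SAMPLE`, and the second task, which stores the absolute value of its input, is a hypothetical stand-in for a task that keeps its output non-negative. The names `observe` and `distinguishing_sequence` are invented for this illustration.

```python
SAMPLE = (-1, 0, 1)          # finite sample of input events (assumption)


def value(task):
    return task[1]           # None plays the role of ⊥


def inputs(task):
    return frozenset(SAMPLE)


def handle(task, i):
    kind = task[0]
    return (kind, abs(i) if kind == "abs-editor" else i)


def observe(task):
    """The observations compared for non-failing tasks: value and inputs."""
    return value(task), inputs(task)


def distinguishing_sequence(t1, t2, depth):
    """Search input sequences up to the given length for a difference."""
    frontier = [((), t1, t2)]
    for _ in range(depth + 1):
        next_frontier = []
        for seq, a, b in frontier:
            if observe(a) != observe(b):
                return seq
            for i in sorted(inputs(a)):
                next_frontier.append((seq + (i,), handle(a, i), handle(b, i)))
        frontier = next_frontier
    return None


plain = ("editor", None)         # a running task: no value, accepts input
positive = ("abs-editor", None)  # also running, but stores |i|
print(distinguishing_sequence(plain, plain, 2))     # None
print(distinguishing_sequence(plain, positive, 2))  # (-1,)
```

Both toy tasks start out running with identical observations, and only the input sequence consisting of -1 reveals the difference. As with contexts for expressions, a bounded search can refute equivalence but cannot establish it.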

Task equivalence
In Section 3.2.1 on failing tasks, we noted that with the addition of internal value (■) in TopHat Next [7], it is no longer the case that a task t is failing iff it accepts no more user input, as is shown by theorem 6.5 in the original top paper [6]. Were this theorem still true, then we could not have the stuck and finished (stable) states, because they contain tasks that do not fail and do not accept user input. We will therefore claim that if we remove ■ from the top language as presented here, then we no longer have the stuck and finished (stable) states (Conjecture 3.13).
Conjecture 3.13. If we remove internal value (■) from top then we are left with only three possible task states after normalisation: failing, running and finished (unstable).
Based on the five task states, we give the following definition for the semantic equivalence of two tasks:

Properties
Now that we have a definition of semantic equivalence for top-programs, we can look at some interesting properties. We will not give formal proofs in this section. At most, we will give some informal argumentation of why we think a certain equality holds, or we will provide a counterexample to show the inequality of two expressions.

Transforming (functor)
A functor in category theory and functional programming is an object that can be mapped over. In top, we can map a function over a task by using the transform (▲) combinator. For an object F to qualify as a functor, it needs to satisfy two laws: (1) the identity law, which says that if we map the identity function over F, then the result is again F; and (2) the composition law, which says that mapping the composition of two functions over F is the same as mapping the two functions over F one after the other. In top, we can express these laws as follows: We think that these laws hold with our definition of semantic equivalence. We argue informally that this is true for the identity law. Given any task t and state σ, we have that F(id ▲ t) = F(t) by definition (see Figure 2.18); I(id ▲ t) = I(t) by definition (see Figure 2.15); and V(id ▲ t, σ) = V(t, σ), because if t has no value, then neither does id ▲ t, and if V(t, σ) = v for v ≠ ⊥, then V(id ▲ t, σ) = id v ↓ v. A similar argument can be made for the composition law. We therefore believe that the Task type former in top is a functor.
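Read with transform (▲) as the mapping operation, the two functor laws described above would take the following shape. This is a sketch based on the prose description: the symbol ≈ is used here only as a placeholder for task equivalence, and f and g stand for arbitrary host-language functions of suitable types.

```latex
\begin{align*}
  \mathit{id} \blacktriangle t
    &\approx t
    && \text{(identity)} \\
  (f \circ g) \blacktriangle t
    &\approx f \blacktriangle (g \blacktriangle t)
    && \text{(composition)}
\end{align*}
```

The informal argument in the text checks the identity law observation by observation: F, I and V each return the same result for id ▲ t as for t.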

Pairing (applicative/monoidal functor)
An applicative functor is a construct in functional programming, which is always also a functor. Traditionally, it needs an apply function which, given the functors F (a → b) and F a, returns F b. This function can also be expressed differently if we have a mapping and pairing function over F. In top, these are transform (▲) and pairing (▶◀). The laws that an applicative functor must satisfy can be written in top as follows: However, if we take t = ↯ for the left and right identity laws, then in both cases we have that the right-hand side is a failing task, but the left-hand side is not, because pairing requires that both sides fail before it fails itself. Therefore, we will say that the Task operation on types is not an applicative functor. We do however believe that the other two laws hold. For associativity, we need the function assoc := λ⟨a, ⟨b, c⟩⟩.⟨⟨a, b⟩, c⟩ to make sure that both sides have the same type. Lastly, the naturality law states that first pairing two tasks and then mapping two functions is the same as first mapping the functions separately and then pairing them.

Failing
When considering the failing task (↯), we can write down the following (in)equalities: We define the left and right identity for the choice combinator (♦), which say that ↯ can be cancelled out from the left- and right-hand side. For pairing, we will say that left and right pair annihilation do not hold, because F(t₁ ▶◀ t₂) = F(t₁) ∧ F(t₂). Pairing therefore requires that both the left-hand and right-hand side fail before their pairing fails, and thus if only one side fails, it cannot be equivalent to the failing task ↯.
We defined all failing tasks to be semantically equivalent, thus annihilation for ▲, and left step annihilation for ▶ trivially hold. Right step annihilation does not hold, however. Suppose that t is not a failing task, then we also have that t ▶ λx. ↯ is not a failing task, because F(t ▶ λx. ↯) = F(t) by definition (see Figure 2.18). Neither can we normalise t ▶ λx. ↯, because the right-hand side fails (N-StepFail). So if t does not fail, then neither does t ▶ λx. ↯, and we cannot conclude that a non-failing task is semantically equivalent to the failing task ↯.
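The annihilation arguments above only use the failing observation, which can be sketched as a recursive Python function over a toy task encoding. The cases for pairing, step and transform follow the equations quoted in the text; the tuple encoding, the constructor names, and the treatment of choice as mirroring pairing are assumptions of this sketch. The continuation of a step is omitted because it plays no role in F.

```python
FAIL = ("fail",)                      # the failing task


def done(v):                          # internal value, written ■ v in the text
    return ("done", v)


def editor():                         # any editor; editors do not fail
    return ("editor",)


def pair(a, b):
    return ("pair", a, b)


def choice(a, b):
    return ("choice", a, b)


def step(t):                          # t ▶ e, with the continuation e left out
    return ("step", t)


def transform(t):                     # e ▲ t, with the function e left out
    return ("transform", t)


def failing(task):
    """Toy version of the failing observation F."""
    kind = task[0]
    if kind == "fail":
        return True
    if kind in ("pair", "choice"):    # both sides must fail (choice: assumed)
        return failing(task[1]) and failing(task[2])
    if kind in ("step", "transform"): # F(t ▶ e) = F(t) and F(e ▲ t) = F(t)
        return failing(task[1])
    return False                      # editors and internal values


print(failing(pair(FAIL, FAIL)))      # True
print(failing(pair(done(()), FAIL)))  # False: pair annihilation does not hold
print(failing(step(FAIL)))            # True: left step annihilation
print(failing(step(done(2))))         # False: right step annihilation fails
print(failing(transform(FAIL)))       # True: annihilation for transform
```

The second and fourth results are the two counterexamples from the text: a pair with only one failing side, and a step from a non-failing task, are not failing and hence cannot be equivalent to the failing task.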

Stepping (monad)
A monad is a construct in category theory and functional programming. It consists of a type constructor M; a return function that wraps any value a into the monadic value M a; and a bind function that extracts the value a from the monad M a, and uses it to produce a new monad M b. Furthermore, for M to be considered a monad, it must satisfy three laws: (1) return is a left identity for bind, (2) return is a right identity for bind, and (3) bind is associative.
Steps (▶) in top have a monadic flavor to them, and we can wonder whether the step combinator is a bind operation. So, if we consider Task to be the monadic constructor, ■ the return function, and ▶ the bind function, then we can express the three monadic laws in top as follows: Unfortunately, with our definition of program equivalence, we can show that neither the left nor the right identity law holds. For the left identity, take for example g = λx. ↯, then we have that g x is a failing task, because g x = (λx. ↯) x ↓ ↯. However, the left-hand side ■ x ▶ g = ■ x ▶ λx. ↯ is a stuck task, because it does not fail, and the step can never be taken.
For the right identity law we can take for example t = νn. ⊠ n Int. If we send, for example, the input event E n 42 to both sides, then the left-hand side normalises to ■ 42, while the right-hand side becomes νn. □ n 42. As we have already seen, these two tasks are not semantically equivalent, because the value of ■ 42 is stable and cannot be changed anymore, while the value of νn. □ n 42 is unstable and can be updated through user input.
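With Task as the monadic constructor, ■ as return and ▶ as bind, the three monad laws discussed in this subsection would take the following shape. This is a sketch that follows the standard formulation of the laws and the left-hand and right-hand sides named in the two counterexamples; the symbol ≈ is again only a placeholder for task equivalence.

```latex
\begin{align*}
  \blacksquare\, x \blacktriangleright g
    &\approx g\, x
    && \text{(left identity)} \\
  t \blacktriangleright \lambda x.\, \blacksquare\, x
    &\approx t
    && \text{(right identity)} \\
  (t \blacktriangleright f) \blacktriangleright g
    &\approx t \blacktriangleright \lambda x.\, (f\, x \blacktriangleright g)
    && \text{(associativity)}
\end{align*}
```

In the first counterexample, the left-hand side of the left identity law is stuck while the right-hand side is failing. In the second, the left-hand side of the right identity law becomes a finished (stable) task while the right-hand side becomes a finished (unstable) task.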

Chapter 4 Conclusion
In this thesis, we looked at the formal language top, which defines an operational semantics for reasoning about task-oriented programs. In Chapter 3, we gave a definition for the semantic equivalence of two top-programs. We split this definition into two classes: expression equivalence, and task equivalence. For task equivalence, we showed that a task can be in either one of five states after normalisation, and for every two tasks in the same state, we defined what it means for them to be semantically equivalent. We also noted that for task states that still accept user input, it is important to take the interactive setting of top into account, and compare how both tasks react to user input. Finally, in Section 3.3, we presented a set of properties that we believe hold true for top-programs. We showed that the Task type former in top is neither a monad nor an applicative functor, and although we did not prove it formally, we speculated that it is a functor.

Future work
We presented a number of conjectures whose proofs were out of scope for this thesis. To get more confidence in our results, it would be nice to prove these conjectures formally. It would also be nice to prove formally that Definition 3.14 satisfies the five desired properties of semantic equivalence presented at the beginning of Chapter 3. Likewise, the properties in Section 3.3 are missing formal proofs.
For our analysis of top, we left out some task constructs from the top-language as presented in TopHat Next [7]. We also noted in Section 3.2.1 that the addition of internal value (■) in TopHat Next changed some properties of top as it was originally presented. Furthermore, we speculated in Conjecture 3.13 that without ■, we are left with only three task states after normalisation. It might therefore also be interesting to research what effects adding or removing certain task constructs has on our definition of semantic equivalence, and the properties presented in Section 3.3.