In this section we present the application of the previously introduced modelling and verification framework to the development of a distributed railway signalling protocol. The protocol's requirements (Step 1) were defined in Sect. 2, thus the following subsections focus on the formal methodology aspects.
4.1 Step 2: Formal Protocol Model Development in Event-B
We apply the Event-B formalism to develop a high-fidelity functional model and prove the protocol's functional correctness requirements, following the modelling process presented in Sect. 3.2. It is important to note that the protocol model was redeveloped multiple times as various deadlock scenarios were found with the ProB animator and model checker. Below, we overview the final (verified) model.
Modelling was started by creating an abstract model context which contains constants, given sets and uninterpreted functions. In the abstract context, we introduced three (finite) sets, to respectively represent agents (\(\mathsf{agt}\)), resources (\(\mathsf{res}\)) and objectives (\(\mathsf{obj}\)). The context also contains an objective function, which maps each objective to a collection of resources (\(\mathsf{obj} \rightarrow \mathbb{P}(\mathsf{res})\)), and an enumerated set for the agent status counter.
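For concreteness, a minimal sketch of such an abstract context in Event-B-style mathematical notation is given below; the identifiers \(\mathsf{obj\_fun}\) and \(\mathsf{STATUS}\), as well as the non-emptiness restriction, are our illustrative assumptions rather than the model's actual declarations.

```latex
% Hypothetical sketch of the abstract context; identifiers are assumptions
\begin{align*}
&\textbf{sets:}\quad \mathsf{agt},\ \mathsf{res},\ \mathsf{obj},\ \mathsf{STATUS}\\
&\textbf{constants:}\quad \mathsf{obj\_fun}\\
&\textbf{axioms:}\quad \mathsf{finite}(\mathsf{agt}) \wedge \mathsf{finite}(\mathsf{res}) \wedge \mathsf{finite}(\mathsf{obj})\\
&\phantom{\textbf{axioms:}\quad} \mathsf{obj\_fun} \in \mathsf{obj} \rightarrow (\mathbb{P}(\mathsf{res}) \setminus \{\varnothing\})
\end{align*}
```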
The dynamic protocol parts, such as message exchanges, are modelled as variables and events computing the next variable states, and are contained in a machine. According to the proposed model development process, the initial (abstract) machine should summarise the objective of the protocol, which is an agent completing an objective (locking all necessary resources). To capture that, the abstract protocol machine contains two events, respectively modelling an agent locking and then releasing a free objective. The abstract model is refined mostly by modelling the communication aspects of the distributed signalling protocol, and for that we use a backward unfolding style, where each refinement step introduces the preceding protocol step. Below, we overview the refinement chain and the properties we proved at this modelling stage.
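As an illustration only, the two abstract events could take roughly the following Event-B-like shape, assuming a variable \(\mathsf{locked}\) that partially maps agents to the objective they currently hold; the event names, the variable and the guards are our assumptions, not the actual model.

```latex
% Illustrative sketch only; names, variable and guards are assumptions
\begin{align*}
&\mathsf{lock\_objective} \;\widehat{=}\; \textbf{any } a, o \textbf{ where } a \in \mathsf{agt} \setminus \mathrm{dom}(\mathsf{locked}) \wedge o \in \mathsf{obj} \setminus \mathrm{ran}(\mathsf{locked})\\
&\qquad\qquad \textbf{then } \mathsf{locked}(a) := o \;\textbf{end}\\[2pt]
&\mathsf{release\_objective} \;\widehat{=}\; \textbf{any } a \textbf{ where } a \in \mathrm{dom}(\mathsf{locked})\\
&\qquad\qquad \textbf{then } \mathsf{locked} := \mathsf{locked} \setminus \{a \mapsto \mathsf{locked}(a)\} \;\textbf{end}
\end{align*}
```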
Refinement 1 (Abstract ext.). In this refinement we introduce resources into the model, and an agent now tries to fulfil the objective by locking resources. The previous two events (lock/release) are each decomposed into two events capturing the iterative locking and releasing of resources.
Refinement 2. The abstract models are first refined with the \(\mathsf{stage_2}\) part of the protocol. In this refinement, \(\mathsf{r\_2}\), we introduced \(\mathsf{lock}\), \(\mathsf{response}\) and \(\mathsf{release}\) messages and their associated events into the model. In this step we also demonstrated that the protocol \(\mathsf{stage_2}\) ensures safe distributed resource reservation by proving an invariant, which states that no two agents that have requested intersecting collections of resources can both be at the resource-consuming stage.
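A possible formalisation of this mutual-exclusion invariant is sketched below; the variable names \(\mathsf{consuming}\) (agents at the resource-consuming stage) and \(\mathsf{requested}\) (the resource collection requested by an agent) are illustrative assumptions.

```latex
% Sketch of the stage_2 safety invariant; variable names are assumptions
\begin{align*}
\forall a_1, a_2 \cdot\; & a_1 \in \mathsf{consuming} \wedge a_2 \in \mathsf{consuming} \wedge a_1 \neq a_2\\
& \Rightarrow\; \mathsf{requested}(a_1) \cap \mathsf{requested}(a_2) = \varnothing
\end{align*}
```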
Refinement 3. Model \(\mathsf{r\_3}\) is the bridge between protocol stages \(\mathsf{stage_1}\) and \(\mathsf{stage_2}\) and introduces two new messages, \(\mathsf{write}\) and \(\mathsf{pready}\), into the model.
Refinement 4. The final refinement step - \(\mathsf{r\_4}\) - models \(\mathsf{stage_1}\) of the distributed protocol, which is responsible for creating distributed lanes. The remaining messages \(\mathsf{request}\), \(\mathsf{reply}\), \(\mathsf{srequest}\) and their associated events are introduced together with the distributed lane data structure. In this refinement we prove that distributed lanes are correctly formed (req. \(\mathsf{SAF_{3\text{-}4}}\)).
4.2 Step 2: Proving Functional Correctness Properties in Event-B
As shown in Sect. 2.2 (Scenarios 1-2), the high-level system requirements can only be met if an agent invariably and correctly forms a distributed lane. The probabilistic lane-forming eventuality (\(\mathsf{LIV_3}\)) is discussed separately, while in the following paragraphs we focus on the proofs regarding requirements \(\mathsf{SAF_{3\text{-}4}}\).
\(\mathsf{SAF_3}\) is required to ensure that an agent's resource objectives are either not satisfied at all or satisfied in full. The model addresses this via event guards restricting the enabling states of the event that generates an outgoing \(\mathsf{write}\) message. To cross-check this implementation we add an invariant that directly shows that \(\mathsf{SAF_3}\) is maintained in the model. For illustrative purposes we focus on the details of verifying the slightly more interesting case of \(\mathsf{SAF_4}\) and assume that \(\mathsf{SAF_3}\) is proven.
Requirement \(\mathsf{SAF_4}\) addresses potential cross-blocking deadlocks or resource double locking due to distributed lane overriding. The strategy to prove the requirement is to show that agents interested in at least one common resource (related agents) always form distributed lanes with differing indices. We start by assuming that agents only form distributed lanes if all received indices are the same (proved as \(\mathsf{SAF_3}\)). Then, if a resource (or resources) shared between any two related agents sends unique promised pointer values to these agents, these indices decide the distributed lanes, as all other indices from different resources must be the same for a distributed lane to form. Hence, to prove \(\mathsf{SAF_4}\) it is enough to show that each resource replies to a \(\mathsf{request}\) or \(\mathsf{special \, request}\) message with a unique promised pointer value.
To prove that every resource replies to a \(\mathsf{request}\) or \(\mathsf{special \, request}\) message with a unique promised pointer value, we firstly introduced a history variable \(\mathsf{his_{ppt}}\) into our model, mapping each resource and write stamp to a promised pointer value. The main idea behind the history variable was to chronologically store the promised pointer values sent by a resource. We also introduced a time-stamp variable \(\mathsf{his_{wr}}\), recording the current write stamp of each resource, to chronologically order the promised pointer values stored in the history variable.
After introducing the history variables, we modified the events \(\mathsf{resource\_reply\_general}\) and \(\mathsf{resource\_reply\_special}\), which in the protocol update the promised pointer variables, by adding two new actions (see Fig. 3). The first action, \(\mathbf{act_4}\), records in the history variable the promised pointer value (\(\mathsf{ppt(res)}\)) that was sent to the agent at the current time stamp (\(\mathsf{his_{wr}(res)}\)). The second action, \(\mathbf{act_5}\), simply increments the resource's \(\mathsf{res}\) time-stamp variable (\(\mathsf{his_{wr}(res)}\)).
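Assuming \(\mathsf{his_{ppt}}\) is modelled as a per-resource partial function over write stamps and \(\mathsf{his_{wr}}\) as a per-resource counter (this typing is our assumption; the actual model may differ), the two added actions could be sketched as:

```latex
% Illustrative sketch of the added actions; the typing is an assumption
\begin{align*}
\mathbf{act_4}:\quad & \mathsf{his_{ppt}}(\mathsf{res}) := \mathsf{his_{ppt}}(\mathsf{res}) \cup \{\mathsf{his_{wr}}(\mathsf{res}) \mapsto \mathsf{ppt}(\mathsf{res})\}\\
\mathbf{act_5}:\quad & \mathsf{his_{wr}}(\mathsf{res}) := \mathsf{his_{wr}}(\mathsf{res}) + 1
\end{align*}
```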
We can then add the main invariant to prove, \(\mathbf{inv\_saf\_4}\), which states that for any two entries \(\mathsf{n1, n2}\) of the history variable of the same resource, the entry with the larger write stamp also has the larger promised pointer value.
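Under the same assumed per-resource typing, \(\mathbf{inv\_saf\_4}\) could be written along the following lines (a sketch, not the model's exact formulation):

```latex
% Sketch of inv_saf_4 under the assumed per-resource typing
\begin{align*}
\forall r, n_1, n_2 \cdot\; & r \in \mathsf{res} \wedge n_1 \in \mathrm{dom}(\mathsf{his_{ppt}}(r)) \wedge n_2 \in \mathrm{dom}(\mathsf{his_{ppt}}(r)) \wedge n_1 < n_2\\
& \Rightarrow\; \mathsf{his_{ppt}}(r)(n_1) < \mathsf{his_{ppt}}(r)(n_2)
\end{align*}
```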
To prove that \(\mathsf{resource\_reply\_\{general, special\}}\) preserve \(\mathbf{inv\_saf\_4}\), the following properties play the key role: (1) the domain of \(\mathsf{his_{ppt}}\) (i.e., the 'indices' of \(\mathsf{his_{ppt}}\)) is \(\{0, \ldots , \mathsf{his_{wr}} - 1\}\); (2) \(\mathsf{his_{ppt}}(\mathsf{his_{wr}}-1) < \mathsf{his_{ppt}}(\mathsf{his_{wr}})\). Property (2) holds because \(\mathsf{his_{ppt}}(\mathsf{his_{wr}})\) is the maximum of the promised pointer (\(\mathsf{ppt}\)) and the special request slot number, and the promised pointer is incremented whenever \(\mathsf{resource\_reply\_\{general, special\}}\) occurs. We also specified these properties as an invariant (\(\mathbf{inv\_his\_ppt}\)) and proved that they are preserved by the events, which helped to prove \(\mathbf{inv\_saf\_4}\).
Proof Statistics. In Table 1 we provide the overall proof statistics of the Event-B protocol model, which may be used as a metric of the model's complexity. The majority of the generated proof obligations were automatically discharged with the available solvers, and even a large fraction of the interactive proofs required a minimal number of steps. We believe that the high proof automation was due to the use of modelling patterns [23] and SMT-based verification support [6, 17].
Table 1. Event-B protocol model proof statistics
4.3 Step 2: Proving Liveness (req. \(\mathsf {LIV_3}\)) with PRISM
In this subsection, we discuss the stochastic model checking results with which we intend to demonstrate that the \(\mathsf{LIV_3}\) requirement is preserved. In particular, we focus on showing that the \(\mathsf{LIV_3}\) requirement is ensured in Scenario 2 (Sect. 2.2).
In order to demonstrate that the \(\mathsf{LIV_3}\) requirement holds in Scenario 2 (Sect. 2.2), we used the \(\mathsf{stage_1}\) skeleton PRISM model of the protocol to replicate Scenario 2. In this experiment we were interested in observing the effect a promised pointer offset has on the probability of an agent forming a distributed lane while the upper limit of the promised pointer (\(\mathsf{n}\) in Scenario 2) is increased. Early experiments showed that verification would not scale well (several hours for a single data point) if we increased the number of resources and agents beyond two resources and three agents (each agent trying to reserve both resources), so we kept these parameters constant.
For each scenario, we ran a quantitative property, \(\mathsf{P\,=\,?\,[F\ dist_0 > \text{-}1]}\), which asks for the probability of an agent negotiating a distributed lane before the upper promised pointer limit is reached. The three curves (red, green and violet) in Fig. 4 show the effect a promised pointer offset has on the negotiation probability as the queue depth is increased. The results suggest that increasing the offset reduces the probability of negotiating a distributed lane as the queue depth is increased, but the probability still approaches one as the number of rounds is increased (Fig. 4).
To further see the effects of the offset, we considered a different experiment where the same quantitative property was run with the number of possible renegotiations kept constant and the offset increased (light blue plot). The results indicate that the offset only has an effect up to a specific threshold, after which the probability of an agent negotiating a distributed lane is not affected by it. These results suggest that the situation in Scenario 2 does not violate the \(\mathsf{LIV_3}\) requirement, as distributed lanes can be negotiated.
4.4 Step 3: Analysing Performance
The goal of this part is to study the protocol performance under various stress conditions and thus provide assurances of its applicability in real-life situations. To build the simulation, we simply capture the protocol's \(\mathsf{stage_1}\) behaviour in a program. We are also able to obtain bounds on the number of messages required to form lanes in different setups. These can be directly translated into real-life time bounds on the basis of point-to-point transmission times.
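As a purely hypothetical numeric illustration of this translation (the figures below are ours and not measurements from our experiments): if forming all lanes in a given setup requires at most \(N\) sequentially delivered messages and a single point-to-point transmission takes at most \(\tau\), then the wall-clock bound is approximately

```latex
% Hypothetical illustration; N and tau are not measured values
T \;\lesssim\; N \cdot \tau, \qquad \text{e.g.}\ N = 50,\ \tau = 10\,\mathrm{ms} \;\Rightarrow\; T \lesssim 0.5\,\mathrm{s}.
```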
Simulation Construction. The simulation is set up as a collection of actors of two types - agents and resources - and an orchestration component observing and recording message passing among the actors. A message is said to be in transit as soon as it is created by an actor. Every act of message receipt (and receipt only) advances the simulation (world) clock by one unit. Hence, any number of computations leading to message creation can occur in parallel, but message delivery is sequential. To model delays, we define a function that probabilistically picks the message to be delivered among all the messages currently in transit. A special message, called skip, is circulated to simulate the idle passage of time. This message is resent immediately upon receipt by an implicit idle actor.
Let \(\mathbb {M}\) be the set of all messages that can be generated by agents and resources. Also, let \(\mathsf {skip} \notin \mathbb {M}\) denote the skip message and \(\mathbb {M}'= \mathbb {M} \cup \{\mathsf {skip}\}\). By its structure, the set \(\mathbb {M}'\) is countable (each message is identified by a unique integer) and one can define a measure space over \(\mathbb {M}'\). Let D signify the probability that some message \(m \in M \subseteq \mathbb {M}'\) from the message pool M is selected for reception. We define D via the current message pool, the attributes of m such as its source, destination, time stamp and protocol stage, and the world time: \(D = D(M, m, t) = D(M, m.s, m.d, m.c, m.o, t)\). Here M is the set of available messages, m.s and m.d are the message source and destination agent or resource, m.c is the message type (e.g., \(\mathsf {WRITE}\)), m.o is the message time stamp (the point of its creation) and t is the world clock. By defining different probabilities D we are able to address most scenarios of interest.
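A minimal sketch of this simulation loop in Python is given below; the class and function names are ours and purely illustrative, and the actual simulator may be structured differently. The receiving actor is assumed to expose a `receive(msg, clock)` method returning the messages it creates in response.

```python
import random
from dataclasses import dataclass

@dataclass(frozen=True)
class Message:
    src: str      # m.s: source actor (agent/resource), or "idle" for skip
    dst: str      # m.d: destination actor
    kind: str     # m.c: message type, e.g. "WRITE" or "SKIP"
    created: int  # m.o: world time at which the message entered transit

def run(actors, distribution, steps):
    """Deliver one message per step; only message receipt advances the world clock."""
    clock = 0
    pool = [Message("idle", "idle", "SKIP", 0)]  # the circulating skip message
    while clock < steps and pool:
        # Pick the next message to deliver according to D(M, m, t).
        weights = [distribution(pool, m, clock) for m in pool]
        msg = random.choices(pool, weights=weights, k=1)[0]
        pool.remove(msg)
        clock += 1
        if msg.kind == "SKIP":
            # The implicit idle actor immediately resends skip.
            pool.append(Message("idle", "idle", "SKIP", clock))
        else:
            # The receiving actor may create any number of new messages,
            # all of which are in transit from this moment on.
            pool.extend(actors[msg.dst].receive(msg, clock))
    return clock
```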
Uniform Distribution. With \(D(M, m, t) = 1/|M|\), the simulator picks a message from M using a uniform distribution. It is an artificial setting as the time in transit bears no influence over the probability of arrival. Counter-intuitively, the said probability may decrease with the passage of time when new messages are created quicker than they are delivered. The skip message has equal probability with the rest, so the system “speeds up” when M is large. The plots in Fig. 5 show how the protocol performance changes when the number of resources (Resource line), agents (Agent lines), and resources an agent attempts to acquire (Agent goal) increase. We plot separately the time to form all lanes and the time to form any first lane. The values plotted are averaged over 10000 runs.
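In terms of the sketch above, this uniform setting corresponds to a distribution function that ignores everything but the pool size, for example:

```python
def uniform_D(pool, msg, clock):
    # Every message in transit, including skip, is equally likely to arrive.
    return 1.0 / len(pool)
```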