1 Introduction

Distributed computing systems provide many important services, such as electronic banking, information and knowledge sharing, and social networking. They are enablers for innovation; for instance, blockchain technology is based on massively distributed computing. Since our societies increasingly depend on the services offered in this manner, it is important to ensure their performance, dependability, and correctness. The purpose of performance evaluation is to investigate and optimise the amount of useful work being accomplished. Dependability evaluation is concerned with assessing service continuity by means of measures such as reliability and availability. The evaluation of correctness—usually called formal verification—focusses on proving that the service delivered satisfies a formal specification of its behaviour. Usually, all of these techniques are based on a model of the system, which is an abstract representation of the system’s behaviour.

Markov Chains. In numerical performance and dependability evaluation, by far the most prominent models used to represent the temporal dynamics of a system are Markov chains [38]. In this model family, the system is supposed to occupy a state at any moment in time, with the set S of states (the state space) being finite or countably infinite. Markov chains come in two flavours, depending on whether the time domain \(\mathbb {T}\) is considered to be discrete (\(\mathbb {T}=\mathbb {N} = \{\,0, 1, \dots \,\}\)) or continuous (\(\mathbb {T}=\mathbb {R} _+=[0, \infty )\)). The dynamics of a discrete-time Markov chain (DTMC) is determined by a mapping from states to probability distributions over (successor) states. For instance, if state s is mapped to probability distribution \(\mu \), then the system once occupying state s is understood to jump to state \(s'\) with probability \(\mu (s')\) in one time step. Notably, the probability is assumed to be independent of any further information (such as any past behaviour) apart from the state identity of s. This is known as the Markov (or memoryless) property. A continuous-time Markov chain (CTMC) adheres to this property, too, but it now needs to be interpreted in stochastic time, i.e. on a continuous time line where probability mass flows continuously between states. For CTMC, the Markov property implies that neither the past history nor the time already spent in the current state s influences the flow of probability into some state \(s'\). Instead, it is governed by a time-independent rate \(\lambda \), a positive real value (or zero if no flow exists). Thus the overall behaviour of a CTMC is determined by a mapping from state pairs to rates in \(\mathbb {R} _+\). CTMC are arguably a better fit for the nature of distributed computing [5], where it is difficult to assume a common discrete time base. The time spent in state s before jumping to another state \(s'\) is usually called the residence time (or sojourn time) in s. Residence times are geometrically distributed in DTMC, and exponentially distributed in CTMC.
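For a concrete feel of these dynamics: if a CTMC state s has rate 2 to \(s_1\) and rate 3 to \(s_2\), then the residence time in s is exponentially distributed with the total rate \(2 + 3 = 5\), so the probability to still be in s after t time units is \(\mathrm {e}^{-5t}\); upon leaving, the system jumps to \(s_1\) with probability \(2/5\) and to \(s_2\) with probability \(3/5\).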

Labelled Transition Systems. In formal verification, other models appear: state-transition diagrams, automata, and similar formalisms describe the dynamic behaviour of systems here. They often appear in the specific form of labelled transition systems (LTS). A transition system consists of a set of states S and a set of possible state changes. The latter is given as a binary relation on states, i.e. a subset of the cross product \(S \times S\). Intuitively, a pair of states \(\langle s, s' \rangle \) is in this relation if it is possible to jump from s to \(s'\) in a single step. In LTS, state changes are associated with occurrences of actions. A state change from s to \(s'\) then implies the occurrence of a specific action a, which labels the transition—thus we have an LTS. If multiple transitions are possible in a state, then the decision of which one to take is usually interpreted as being nondeterministic. Nondeterminism is especially useful to represent concurrency, a crucial aspect of distributed computing systems. If two systems run concurrently and independently, this is best represented as the nondeterministic interleaving of their individual steps. LTS can thus be endowed with parallel composition operators to model concurrency and interaction of component LTS [6, 40, 44]. With further operators, this is convenient for a compositional modelling style, where the behaviour of components is the result of compositions of smaller building blocks.

Model Checking. Within the spectrum of techniques used in formal verification, model checking is an automated model-based technique to assess whether the possible system behaviours satisfy a property describing the desirable behaviour [3]. Typically, properties are expressed in temporal logics such as LTL or CTL. Model checking usually involves constructing an in-memory representation of the (part of the) state space (relevant to assess the property). It thus gives definitive answers, but faces the state space explosion problem. In the past decades, model checking has been extended to treat aspects such as discrete probabilities and stochastic time. It has become apparent that a joint consideration of performance, dependability and correctness is both possible and worthwhile [2].

This Paper. The purpose of this tutorial paper is to provide a gentle introduction to working with a mathematical formalism integrating the modelling aspects discussed above. We focus especially on the specification and modelling of real systems. The formalism we introduce is called Markov automata (MA), and it can best be described as an orthogonal and compositional superposition of DTMC, CTMC, and LTS. MA were coined in [22, 23]. They are expressive enough to give a semantics to generalised stochastic Petri nets (GSPN) in their full generality [20]. The theoretical properties of MA are the subject of the Ph.D. thesis of Christian Eisentraut [19]; a process-algebraic perspective is covered in the Ph.D. thesis of Mark Timmer [50]. Various algorithmic analysis methods for Markov automata have been developed over the past decade [8, 9, 14, 15, 16, 21, 28, 29, 36, 37, 52].

Using the mathematical formalism of MA directly to build complex models is, however, cumbersome. We instead need a higher-level modelling language. Aside from parallel composition, such languages typically provide variables over finite domains that can be used in expressions to e.g. enable or disable transitions, allowing very large models to be described compactly. In this paper, we use Modest [30] to construct MA models. Rooted in process algebra, Modest provides various composition operators that allow large models to be assembled from smaller, easier-to-understand components. After a formal definition of MA, parallel composition, and various types of properties (that we may want to compute for a given MA model) in Sect. 2, we introduce the basics of Modest in a step-by-step fashion in Sect. 3. We compare it to alternative languages with respect to its succinctness, expressivity, and readability. We then guide the reader through the modelling and analysis of two very different applications: we optimise an attack on Bitcoin in Sect. 4, and we evaluate the performance of a small, but intricate resource-sharing queueing system in Sect. 5 with the Modest Toolset. Algorithmic aspects of the analysis of MA with the Modest Toolset are the subject of a companion paper [13].

Previous Work. Our presentation of MA in Sect. 2 is adapted and extended from [13], as is the text in Sects. 3.2 and 3.3. The Bitcoin models in Sect. 4 are inspired by [24], and the bitcoin-attack.modest model is part of the Quantitative Verification Benchmark Set [34]. The reentrant queueing system, of which we present a new Modest model in Sect. 5, was first described in [36].

Fig. 1. The MA family tree

Fig. 2. Example Markov automata

2 Markov Automata

The mathematical formalism of Markov automata provides nondeterministic choices as in LTS, discrete probabilistic decisions as in DTMC, and stochastic time as in CTMC. The relationships between these and other formalisms are visualised in Fig. 1. The combination of DTMC and LTS leads to the model family of (discrete-time) Markov decision processes [46] (MDP, or probabilistic automata [49]), where transitions of the form \(s \xrightarrow {a} \mu \) offer in state s a (nondeterministic) decision option labelled by action a that is followed by a probabilistic decision of where to jump according to probability distribution \(\mu \). The conceptually closest model in continuous time is that of continuous-time MDP [46] (CTMDP), where action-labelled transitions are of the form \(s \xrightarrow {a} e\) with e mapping states to rates. Such a transition indicates that probability mass flows from state s to state \(s'\) with rate \(e(s')\) provided action a is chosen in state s. Markov automata instead combine MDP and CTMC in an orthogonal manner by providing two types of transitions: \(s \xrightarrow {a} \mu \) as in MDP, and \(s \xrightarrow {\lambda } s'\) as in CTMC. We now define Markov automata formally and describe their semantics.

Preliminaries. We write [a, b] for the real interval \(\{\,x \in \mathbb {R} \mid a \le x \le b\,\}\), (a, b) for \(\{\,x \in \mathbb {R} \mid a< x < b\,\}\), and analogously for half-open intervals. Given a set S, its powerset is \(2^{S} \). A (discrete) probability distribution over S is a function \(\mu :S \rightarrow [0, 1]\) such that its support \( spt ({\mu }) = \{\, s \in S \mid \mu (s) > 0 \,\}\) is countable and \(\sum _{s \in spt ({\mu })} \mu (s) = 1\). \( Dist ({S}) \) is the set of all probability distributions over S, and \(\mu _1 \otimes \mu _2\) is the product distribution of \(\mu _1\) and \(\mu _2\) defined by \((\mu _1 \otimes \mu _2)(\langle s_1, s_2 \rangle ) = \mu _1(s_1) \cdot \mu _2(s_2)\). We refer to discrete random choices as probabilistic and to continuous ones as stochastic. We write \(\{\, x_1 \mapsto y_1, \dots \,\}\) to denote the function that maps each \(x_i\) to \(y_i\), and if necessary in some context, implicitly maps to 0 all x for which no explicit mapping is specified. Thus we can e.g. write \(\{\,s \mapsto 1\,\}\) for the Dirac distribution that assigns probability 1 to s.

Definition 1

A Markov automaton (MA) is a tuple \(M = \langle S, s_0, A, P, Q, rr , br \rangle \) where S is a finite set of states with initial state \(s_0 \in S\), A is a finite set of actions, \(P: S \rightarrow 2^{A \times Dist ({S})} \) is the probabilistic transition function, \(Q : S \rightarrow 2^{\mathbb {Q} \times S} \) is the Markovian transition function, \( rr :S \rightarrow [0, \infty )\) is the rate reward function, and \( br :S \times Tr (M) \times S \rightarrow [0, \infty )\) is the branch reward function. \( Tr (M) = \bigcup _{s \in S} (P(s) \cup Q(s))\) is the set of all transitions; it must be finite. We require that \( br (s, tr , s') > 0\) implies \( tr \in P(s) \cup Q(s)\).

We also write \(s \xrightarrow {a}_{P} \mu \) for \(\langle a, \mu \rangle \in P(s)\) and \(s \xrightarrow {\lambda }_{Q} s'\) for \(\langle \lambda , s' \rangle \in Q(s)\), and omit the P and Q subscripts if they are clear from the context. In \(s \xrightarrow {\lambda } s'\), we call \(\lambda \) the rate of the Markovian transition. We refer to every element of \( spt ({\mu }) \) as a branch of \(s \xrightarrow {a} \mu \); a Markovian transition has a single branch only (its target state). We define the exit rate of \(s \in S\) as \(E(s) = \sum _{\langle \lambda , s' \rangle \in Q(s)} \lambda \).

Example 1

Fig. 2 shows two MA \(M_1\) and \(M_2\) without rewards. We draw probabilistic transitions as solid lines and Markovian ones as dashed lines. If a transition leads to a single target state, we omit the intermediate probabilistic branching node. Thus, for \(M_1\), we have five states in \(S = \{\, 0, 1, 2, 3, 4 \,\}\), the initial state being \(s_0 = 0\), two actions in \(A = \{\,\texttt {a}, \texttt {c}\,\}\), two probabilistic transitions in \(P(0)\), and two Markovian transitions, both with rate 2.

Intuitively, the semantics of an MA is that, in state s, (1) the probability to take Markovian transition \(s \xrightarrow {\lambda } s'\) and move to state \(s'\) within t model time units is \({\lambda }/{E(s)} \cdot (1 - \mathrm {e}^{-E(s) \cdot t})\), i.e. the residence time in s follows the exponential distribution with rate E(s) and the choice of transition is probabilistic, weighted by the rates; and (2) at any point in time, a probabilistic transition \(s \xrightarrow {a} \mu \) can be taken with the successor state being chosen according to \(\mu \). An MA thus resolves some choices in a probabilistic (the choice of successor state of a probabilistic transition, the choice among Markovian transitions) or stochastic (the choice of residence time) way, while other choices are left open as nondeterministic (the timing of probabilistic transitions, and the choice among multiple available probabilistic transitions). Due to the presence of nondeterminism, an MA itself does not induce a probability measure over its possible behaviours. We refer the interested reader to e.g. [35] for a complete formal definition of this semantics.

An MA without Markovian transitions is an MDP; it is a DTMC if in addition P maps each state to a singleton set. An MA without probabilistic transitions is a CTMC. The co-existence of action-labelled probabilistic transitions of the form \(s \xrightarrow {a} \mu \) and of Markovian transitions of the form \(s \xrightarrow {\lambda } s'\) separates actions from timing. It enables parallel composition operators with action synchronisation for MA without the need to prescribe an ad-hoc operation for combining rates.

Definition 2

Given two MA \(M_i = \langle S_i, s_{0,i}, A_i, P_i, Q_i, rr _i, br _i \rangle \) for \(i \in \{\,1, 2\,\}\), a finite set A of actions, and a synchronisation relation

$$\begin{aligned} sync \subseteq (A_1 \uplus \{\,\bot \,\}) \times (A_2 \uplus \{\,\bot \,\}) \times A, \end{aligned}$$

their parallel composition is the MA

$$\begin{aligned} M_1 \parallel M_2 = \langle S_1 \times S_2, \langle s_{0,1}, s_{0,2} \rangle , A, P, Q, rr , br \rangle \end{aligned}$$

where P is the smallest function that satisfies the inference rules

$$\begin{aligned} \frac{\langle a_1, \bot , a \rangle \in sync \quad s_1 \xrightarrow {a_1} \mu _1}{\langle s_1, s_2 \rangle \xrightarrow {a} \mu _1 \otimes \{\, s_2 \mapsto 1 \,\}} \qquad \frac{\langle \bot , a_2, a \rangle \in sync \quad s_2 \xrightarrow {a_2} \mu _2}{\langle s_1, s_2 \rangle \xrightarrow {a} \{\, s_1 \mapsto 1 \,\} \otimes \mu _2} \end{aligned}$$

$$\begin{aligned} \frac{\langle a_1, a_2, a \rangle \in sync \quad s_1 \xrightarrow {a_1} \mu _1 \quad s_2 \xrightarrow {a_2} \mu _2}{\langle s_1, s_2 \rangle \xrightarrow {a} \mu _1 \otimes \mu _2} \end{aligned}$$

Q is the smallest function that satisfies the inference rules

$$\begin{aligned} \frac{s_1 \xrightarrow {\lambda } s_1'}{\langle s_1, s_2 \rangle \xrightarrow {\lambda } \langle s_1', s_2 \rangle } \qquad \frac{s_2 \xrightarrow {\lambda } s_2'}{\langle s_1, s_2 \rangle \xrightarrow {\lambda } \langle s_1, s_2' \rangle } \end{aligned}$$

and for all states \(\langle s_1, s_2 \rangle \in S_1 \times S_2\), we have \( rr (\langle s_1, s_2 \rangle ) = rr _1(s_1) + rr _2(s_2)\). Function \( br \) sums the values of \( br _1\) and \( br _2\) for the combinations of branches in synchronisation (third inference rule), and otherwise preserves the original branch rewards.

The first two inference rules for P allow the individual MA to proceed independently of each other if allowed by \( sync \); the third rule covers the case where both automata synchronise on a pair of actions as determined by \( sync \). The rules for Q simply state that Markovian transitions are always performed independently. An element \(\langle a_1, a_2, a \rangle \) of \( sync \) is called a synchronisation vector. This form of parallel composition can be generalised to more than two automata in the straightforward way with longer synchronisation vectors. It is very flexible, allowing in particular the traditional CCS-style binary and CSP-style multi-way synchronisation patterns [40, 44] to be encoded. Originally established by Cadp [26], it is today used for MA in the Jani format [12]. We refer to a general parallel composition of several MA as a network of MA.

Example 2

Fig. 2 includes the parallel composition of the example MA \(M_1\) and \(M_2\), where we write nm for state \(\langle n, m \rangle \). The two automata synchronise on the shared actions a and c, i.e. we have \( sync = \{\, \langle \texttt {a}, \texttt {a}, \texttt {a} \rangle , \langle \texttt {c}, \texttt {c}, \texttt {c} \rangle , \langle \bot , \texttt {b}, \texttt {b} \rangle \,\}\), where the last vector lets \(M_2\) perform its independent action b.

We defined MA as open systems [10]: probabilistic transitions can interact with, wait for, and be blocked by other MA in parallel composition. For verification, we make the usual closed system and maximal progress assumptions: probabilistic transitions face no further interference and take place without delay. If multiple probabilistic transitions are available in a state, however, the choice between them remains nondeterministic. Since the probability that a Markovian transition is taken in zero time is 0, the maximal progress assumption allows us to remove all Markovian transitions from states that also have a probabilistic transition. In such closed MA, we can thus distinguish between Markovian states (where \(P(s) = \varnothing \)) and probabilistic states (where \(Q(s) = \varnothing \)). The behaviour of a closed, deadlock-free MA M is defined via its paths:

Definition 3

Let M be a closed, deadlock-free MA as above. A path \(\pi \) of M is an infinite sequence

$$\begin{aligned} \pi = s_0\, t_0\, tr _0\, s_1 \ldots \in (S \times [0, \infty ) \times Tr (M))^\upomega \end{aligned}$$

such that, for all \(i \in \{\,0, \dots \,\}\), we have \( tr _i \in P(s_i) \cup Q(s_i)\), \(Q(s_i) = \varnothing \) implies \(t_i = 0\), \( tr _i = \langle a, \mu \rangle \in P(s_i)\) implies \(\mu (s_{i+1}) > 0\), and \( tr _i = \langle \lambda , s' \rangle \in Q(s_i)\) implies \(s' = s_{i+1}\). \(\varPi (M)\) is the set of all paths of M. We write \(\varPi _ fin (M)\) for the set of all path prefixes \(\pi _{ fin }\) ending in a state. The last state of \(\pi _{ fin }\) is denoted \( last (\pi _{ fin })\). Let \(\pi _{\le i}\) denote the prefix of \(\pi \) that ends in state \(s_i\). The duration \(\mathrm {dur}(\pi _{ fin })\) of a path prefix is the sum of its residence times \(t_i\). A path's reward is

$$\begin{aligned} \mathrm {rew}(\pi ) = \sum _{i = 0}^{\infty } \bigl ( t_i \cdot rr (s_i) + br (s_i, tr _i, s_{i+1}) \bigr ). \end{aligned}$$

It may be \(\infty \), and is defined analogously for prefixes (where it is always finite).

A path comprises states \(s_i\), times \(t_i\) spent in \(s_i\), and transitions \( tr _i\) taken from \(s_i\) to \(s_{i+1}\). It is a resolution of all nondeterministic, probabilistic, and stochastic choices. To define a probability measure, we resolve nondeterminism only:

Definition 4

Let M be a closed, deadlock-free MA as above. A scheduler is a function \(\sigma :\varPi _ fin (M) \rightarrow Tr (M)\) such that \(\sigma (\pi _{ fin }) = tr \) implies \( tr \in P( last (\pi _{ fin })) \cup Q( last (\pi _{ fin }))\). We write \(\mathfrak {S}(M)\) for the set of all schedulers of M. A time-dependent scheduler is in \(S \times [0, \infty ) \rightarrow Tr (M)\); a memoryless scheduler is in \(S \rightarrow Tr (M)\). Given a time bound \(b \in [0, \infty )\), every time-dependent scheduler \(\sigma _{t}\) defines a corresponding scheduler \(\sigma \) by \(\sigma (\pi _{ fin }) = \sigma _{t}( last (\pi _{ fin }), b - \mathrm {dur}(\pi _{ fin }))\). Every memoryless scheduler \(\sigma _{ ml }\) defines a corresponding scheduler \(\sigma \) by \(\sigma (\pi _{ fin }) = \sigma _{ ml }( last (\pi _{ fin }))\).

We define deterministic schedulers only since randomised schedulers are in practice only needed for multi-objective problems [47]. We note that CTMDP with early schedulers [48] can be encoded as closed MA. If we “apply” a scheduler to an MA, it removes all nondeterminism, and we are left with a fully stochastic process whose paths can be measured and assigned probabilities according to the rates and distributions in the (remaining) MA. Formally, these probability measures over sets of measurable paths are built via cylinder sets; we refer the interested reader to e.g. [35] for a fully formal definition. For all of the following types of properties, we are interested in the maximum (supremum) and minimum (infimum) values when ranging over all schedulers \(\sigma \in \mathfrak {S}(M)\):

  • Reachability probabilities: Given goal states \(G \subseteq S\), compute the probability of the set of paths that include a state in G. Memoryless schedulers suffice to achieve optimal results (i.e. the maximum and minimum probabilities).

  • Time-bounded reachability: Additionally restrict to paths where the duration of the prefix to the first state in G is below a bound \(b \in [0, \infty )\). Time-dependent schedulers suffice.

  • Expected accumulated rewards: Compute the expected value of the random variable that assigns to \(\pi \) the value \(\mathrm {rew}(\pi _{ fin })\) with \(\pi _{ fin }\) being the shortest prefix of \(\pi \) with a state in G. This is well-defined if the maximum (minimum) probability to reach G is 1; otherwise, we define the minimum (maximum) expected accumulated reward to be \(\infty \). Memoryless schedulers suffice.

  • Long-run average rewards: Compute the expected value of the random variable that assigns to path \(\pi \) the value \(\lim _{i \rightarrow \infty } \mathrm {rew}(\pi _{\le i})/\mathrm {dur}(\pi _{\le i})\). Memoryless schedulers suffice.

(Plot for Example 3: probability to reach the goal state within the time bound, for the two memoryless schedulers, as a function of the remaining time.)

Example 3

Consider MA \(M_1 \,{\parallel }\, M_2\) of Fig. 2 and the probability to reach a specific goal state within 1 time unit. In the state that offers both action a and action b, we have to decide which of the two to choose. The optimal decision depends on the amount of time t that has passed before reaching this state. In the plot above, we show the probability of reaching the goal state within the time limit (y-axis) depending on the remaining time \(1-t\) (x-axis). The blue (initially upper) line represents the reachability probability for the memoryless scheduler that always chooses a and the red (initially lower) one is for the scheduler that always takes action b. A time-dependent scheduler can make better decisions than either of these two by determining the values of t for which a results in a higher probability than b and vice-versa. The optimal scheduler thus chooses a if and only if \(1-t \le 0.63\) approximately.

We can extend MA with discrete variables: An MA with variables (MAV) is an MA as in Definition 1 that additionally contains a finite set of variables. We call its states locations, its transitions edges, and their branches destinations. Every edge additionally has a guard, and every destination has a set of assignments (or updates). A guard is a Boolean expression over the variables that determines whether the edge is enabled; the assignments modify the values of the variables. Tools usually work with the semantics of an MAV in terms of an MA: the MAV \(M_V\) corresponds to the MA M with states \(\langle \ell , v \rangle \), each consisting of a location \(\ell \) of \(M_V\) and a valuation v that assigns a value to every variable. The transitions out of \(\langle \ell , v \rangle \) are those edges out of \(\ell \) in \(M_V\) whose guard is satisfied in v. The target state of a branch of a transition is \(\langle \ell ', v' \rangle \) with \(\ell '\) the target location in \(M_V\) and \(v'\) obtained by executing the destination's assignments on v. Our parallel composition operator extends to MA with variables by using the conjunction of guards and the union of assignments for synchronising transitions. If we allow variables to be shared between MAV, parallel composition does not distribute over semantics; we need to compose the MAV before converting them to MA.

3 Modelling with Markov Automata

Tools for the automated analysis of MA need a syntax in which the model and the properties of interest are specified. As noted in Sect. 1, such a modelling language needs to provide a parallel composition operator (akin to the operator introduced in the previous section) such that large MA can be built from small specifications, and will typically support modelling with variables.

3.1 Modest for Markov Automata

Modest  [4, 30] is the modelling and description language for stochastic timed systems. At its core, it is a process algebra: it provides various operations such as parallel and sequential composition, parameterised process definitions, process calls, and guards to flexibly construct complex models out of small and reusable components. Its syntax, however, borrows heavily from commonly used programming languages, and it provides high-level conveniences such as loops and an exception handling mechanism. As such, Modest tends to be more verbose than classic process algebras, but also more readable and beginner-friendly. To specify complex behaviour in a succinct manner, Modest provides variables of standard basic types (e.g. bool, int, or bounded int), arrays, and user-defined recursive datatypes akin to functional programming languages. Its syntax for expressions is aligned with C-like programming languages for ease of use.

Let us now introduce the Modest language syntax step-by-step by using it to model our example MA shown in Fig. 2, starting with \(M_1\). Modest models are structured into processes, with each process consisting of declarations and a behaviour. The declarations introduce all named objects like actions, variables, exceptions, nested processes, etc., that are available for use in the behaviour and inside nested processes. A process' behaviour defines an MA with those variables. To model \(M_1\) as a Modest process, we thus start by declaring the actions and a Boolean variable to later distinguish between states 1 and 2:

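A minimal sketch of such declarations (the variable name v and its initial value are our own choices):

  action a, c;   // the two actions of M1
  bool v = true; // will later distinguish states 1 and 2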

The simplest behaviour in Modest is to perform a (previously declared) action:


Semantically, the behaviour that consists of just the action name a represents an MAV with a single a-labelled edge whose guard expression is \( true \), leading to the location of successful termination. Every location \(\ell \) is uniquely identified by a behaviour such that the MAV with \(\ell \) as its initial location is the semantics of the behaviour. Successful termination, usually drawn as a checkmark, is a special behaviour that is not part of the syntax of Modest, and whose semantics is a state with no outgoing edges. It receives special treatment by several other Modest constructs. Modest also contains the stop construct with the same semantics but without the special treatment.

Initially, automaton \(M_1\) offers a choice between two probabilistic transitions. The alt construct combines multiple behaviours into a nondeterministic choice between them, thus the initial choice in \(M_1\) can be represented as follows:

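At this point, with both alternatives still being plain actions, a sketch of the choice is:

  alt {
  :: a
  :: c
  }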

The semantic effect of the alt construct is simply to merge the initial states of the semantics of its child behaviours, the start of each of which is indicated by ::. Note that both edges lead to the same location here; this is because the semantics of both behaviours a and c end in the identical location of successful termination.

Now, in \(M_1\), the transition labelled a actually has two branches. The branching of probabilistic transitions can be represented in Modest with the palt construct. Since it does not create a new transition, but only defines branches, it has to be prefixed by the transition's action:

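A sketch; the two weights are illustrative, as the text does not fix the branch probabilities:

  alt {
  :: a palt {
     :1: {= v = true =}
     :1: {= v = false =}
     }
  :: c
  }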

Probabilities are specified as weights between colons, i.e. the actual probability in the semantics is calculated as the given weight divided by the sum of all weights in the palt construct. The assignments for every branch are specified in {= ... =} blocks, and they are executed atomically, so e.g. the assignment block {= x = y, y = x =} performs an in-place swap of variables x and y. To create an edge labelled a with a single destination and assignments u, we can omit the palt and just write a {= u =}. Observe that, in the semantics of our example above, all destinations still lead to the same location. However, the semantics of this MAV contains two states in that location: one where v is \( true \), which is the target of the branch for the uppermost destination, and one where it is \( false \). We will from now on omit \( true \) guards and empty assignment sets in such drawings.

Continuing to model \(M_1\) in Modest, we now add the Markovian transitions to state 4. We need two new constructs: the semicolon for sequential composition, and rate for rates. First, the semantics of the sequential composition P; Q, for two behaviours P and Q, is to first behave like P, and upon successful termination of P (i.e. upon reaching the location of successful termination), behave like Q. We thus get the following:

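A plausible shape of this intermediate model, with stop marking the branch that is meant to deadlock and the two \(\tau \) edges not yet timed:

  alt {
  :: a palt {
     :1: {= v = true =}; tau
     :1: {= v = false =}; stop
     }
  :: c; tau
  }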

tau is the predefined silent action, which does not take part in synchronisation (i.e. in a binary parallel composition, it is governed by the synchronisation vectors \(\langle \tau , \bot , \tau \rangle \) and \(\langle \bot , \tau , \tau \rangle \), but cannot occur in any other vectors). To turn the \(\tau \)-labelled probabilistic edges into Markovian ones, we simply specify rates:

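The same sketch with the delays made Markovian, using rate 2 as in Fig. 2:

  alt {
  :: a palt {
     :1: {= v = true =}; rate(2) tau
     :1: {= v = false =}; stop
     }
  :: c; rate(2) tau
  }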

Modest enforces the separation of probabilistic and Markovian transitions by requiring edges for which a rate is specified to be labelled with tau. If this restriction is not met, the model is recognised as a CTMDP.

In the model above, the behaviour rate(2) tau occurs twice. We can eliminate this duplication by moving it out of the alt construct. At this point, let us also introduce the when construct to specify guards: instead of using stop to make the model deadlock in the v = false destination, we use the guard when(v) to cause the deadlock in the semantics of the MAV. The result is:

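A sketch of the deduplicated model; the guard polarity when(v) is our reconstruction:

  alt {
  :: a palt {
     :1: {= v = true =}
     :1: {= v = false =}
     }
  :: c
  };
  when(v) rate(2) tau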

The semantics of this final MAV is almost isomorphic to \(M_1\); the difference is that states 1 and 3 are merged since they have the same behaviour.

In Fig. 3, we show the full Modest model of the parallel composition of MA \(M_1\) and \(M_2\) of Fig. 2. It includes the model that we built for \(M_1\) above as the body of the named process M1. Such processes can have parameters (specified between the parentheses in the declaration, not shown here) and local variables. A process call like M1() behaves exactly like the behaviour of M1, with all formal parameters being assigned the values of the actual arguments, and new variable instances created for all parameters and local variables to separate them from any other calls to M1. The semantics of the par construct is the n-ary parallel composition of its child behaviours, with synchronisation vectors that implement CSP-style synchronisation for all actions declared with the action keyword (in this model, that is the vectors given in Example 2), and as described above for \(\tau \). The model also declares two properties for verification, P_Min and P_Max, which ask for the probability to reach a designated goal state—made observable via the global variable succ, which is of bounded integer type with range \(\{\,0, 1, 2\,\}\)—within time bound B akin to Example 3. B is an open parameter for which values can be specified at verification time.
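The overall structure of such a model can be sketched as follows. The body of M2 and the assignments to succ are not detailed in the text, so they are only indicated; the property syntax is a sketch of the time-bounded reachability queries:

  const real B;       // open parameter: the time bound
  int(0..2) succ = 0; // makes the goal state observable

  action a, b, c;

  property P_Min = Pmin(<>[T<=B] (succ == 2));
  property P_Max = Pmax(<>[T<=B] (succ == 2));

  process M1()
  {
    bool v = true; // local variable
    alt {
    :: a palt { :1: {= v = true =} :1: {= v = false =} }
    :: c
    };
    when(v) rate(2) tau
  }

  process M2()
  {
    // behaviour over actions a, b, and c that sets succ along the way
    // (not developed in the text)
    ...
  }

  par {
  :: M1()
  :: M2()
  }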

At this point, we have covered most basic constructs of Modest. There are many features not used in this small model; we will introduce more constructs in Sects. 4 and 5. The interested reader also finds additional Modest MA models in the Quantitative Verification Benchmark Set (QVBS, [34]) at qcomp.org.

Fig. 3. Modest model for \(M_1 \,{\parallel }\, M_2\)

Fig. 4. Mapa process algebra

Fig. 5. Prism dialect supporting MA

Fig. 6. Imca state space format

3.2 The Modest Toolset

The creation and analysis of MA with Modest is supported by the Modest Toolset [32], a comprehensive suite of tools for quantitative modelling and verification. Aside from Modest, it also supports the Jani model interchange format [12] as an input language. MA are supported in the toolset's mosta, moconv, mcsta, and modes tools. mosta visualises the symbolic semantics of models (i.e. networks of MAV before and after parallel composition, as shown throughout Sect. 3.1) and is useful for model debugging. moconv transforms models between Modest and Jani, and performs syntactic rewriting and optimisations. mcsta is a fast explicit-state model checker that implements state-of-the-art MA-specific algorithms [13] and uses secondary storage to alleviate state space explosion [33]. modes [11] is a statistical model checker with automated rare event simulation capabilities. It implements lightweight scheduler sampling [43] for nondeterministic models, including MA [17]. The Modest Toolset is written in C#, works on Linux, Mac OS, and Windows, and is freely available at modestchecker.net. All its tools share a common infrastructure for parsing and syntactic transformations. mcsta and modes build on the same state space exploration engine that compiles models to bytecode at runtime for memory efficiency and performance.

3.3 Alternative Modelling Languages

Modest is not the only modelling language for MA. We now briefly contrast it to the currently available alternative modelling languages with support for MA.

State Space Files for Imca. The first MA-specific algorithms were implemented in the Imca tool [27]. Its only input language is a text-based explicit state space format as illustrated for our example of \(M_1 \,{\parallel }\, M_2\) in Fig. 6. This is clearly not a useful modelling language, but a format to be automatically generated by tools.

Guarded Commands with Storm. The Storm model checker [18] provides many input languages, with MA being supported through a state space format similar to Imca's, via Jani, as the semantics of generalised stochastic Petri nets [20] in GreatSPN format [1], and through an extension of the Prism guarded command language. We show our example in the latter in Fig. 5. This is a very simple and small language that is easy to learn; however, it completely lacks higher-level constructs to structure and compose models aside from the implicit parallel composition of its modules.

Process Algebra with Scoop. Mapa [51] is a dedicated process algebra for MA. It is supported by Scoop [51], which can linearise, reduce, and finally export Mapa models to Imca for verification. We show the example of \(M_1\) and \(M_2\) in Mapa in Fig. 4. As a classic concise process algebra, Mapa tends to be very succinct, but also difficult to read. Mapa models can be much more flexibly composed than Prism models, yet there is less syntactic structure than in Modest—although the languages conceptually share many operators. Mapa notably has a predefined queue datatype, and users can specify custom non-recursive datatypes.

Jani [12] is a model interchange format designed to ease tool development and interoperation. It is Json-based and thus human-debuggable, but not intended to be human-writable. It represents networks of automata with variables symbolically. Since both the Modest Toolset and Storm support Jani, it is possible to e.g. build MA models in the Modest language, export them to Jani with moconv, and then verify them with Storm. Likewise in the other direction, we can e.g. create a Petri net with GreatSPN, convert it to Jani with Storm, and analyse it with mcsta or modes. In this way, the most appropriate modelling language can be combined with the best analysis method and tool for every specific scenario. The Json-based syntax, however, is too verbose for our example to be displayed in Jani format in this paper.

4 Optimising Attacks on Bitcoin

Bitcoin [45] is currently the most popular cryptocurrency. It is built on blockchain technology using the proof-of-work approach. Every block in the blockchain contains a nonce (a randomly chosen number), a set of (monetary) transactions, and a hash of the predecessor block in the chain. In this way, no past block can be changed without invalidating (the hashes in) all its successors. A block is valid if the hash of the block's contents falls below a target value. To create a valid block, a node in the Bitcoin network repeatedly selects a new nonce until it finds one that makes the block valid. Creating new blocks is called mining, and overall constitutes the proof-of-work approach since the repeated hashing is computationally (and thus environmentally) expensive. As the computational power used for mining (the hash rate) changes, the Bitcoin network periodically adjusts the target value such that the average time to find a new block (the confirmation time) is 10 min. In practice, the actual confirmation time varies; it was about 12 min in 2017 [24]. Every node in the network stores its own copy of the entire blockchain. Once a node finds a new valid block, it broadcasts the block to the network. Due to network delays, multiple new blocks may propagate at the same time. Nodes add the first block they receive to their local chain. Thus multiple forks of the blockchain may exist on different nodes. Each node always considers the longest chain known to it as valid, and miners extend the longest chain. A transaction is n-confirmed with confirmation depth \(n=0\) if it is not part of any valid block, and otherwise with \(n \ge 1\) if there are \(n - 1\) blocks in the chain beyond the block b that the transaction is part of. The amount of work to invalidate a fork that starts with b increases with n. Many services only accept Bitcoin payments once they are at least 6-confirmed [7].

In this section, we use Modest and the Modest Toolset to study two variants of a secret-fork attack on Bitcoin, inspired by the Andresen attack proposal and a study performed with Uppaal smc in [24]. The attacker secretly creates a fork, keeps mining on it until it reaches a certain length greater than that of the publicly known blockchain, and then publishes it all at once. This would invalidate the public fork, with the private one becoming the valid blockchain. The original aim of the attack was to undermine the trust in Bitcoin; if it succeeds on the first attempted fork, it can equally be used for double spending by invalidating a specific transaction. For the attack to be feasible, the malicious attacker must control a significant fraction m of the hash rate.

4.1 Modelling and Evaluating the Double-Spending Attack

If the goal of the attacker is double spending, then it creates a transaction that spends some Bitcoin funds and announces it to the network for inclusion in the next block. At the same time, it starts mining on its own secret fork. Let \( cd \) be the confirmation depth after which a transaction is accepted by the receiver of the funds. If the attacker manages for its secret fork to become longer than the public fork, and longer than \( cd \), then it can publish this fork immediately after the public one reaches length \( cd \). At that point, the receiver of the funds has just accepted the transaction (and presumably fulfilled its part of the contract). The secret fork however invalidates the public one since it is longer, and thus invalidates the transaction. The attacker is now free to spend the same funds again. Due to the proof-of-work system, such an attack is possible, but—as long as the attacker controls less than \(50\%\) of the hash rate—has a low success probability and an immense computational cost.

Modelling the Attack in Modest. We build an abstract model of mining in Modest, reduced to the aspects relevant to the attack. The observation that a new block is mined every 12 min on average fits well with MA: we model block creation via Markovian transitions with a total rate of \(\frac{1}{12}\). We abstract from network delays, i.e. blocks propagate instantaneously. We consider a single attacker, assuming that the rest of the world’s miners behave in the normal “honest” manner and publish all mined blocks immediately.

Honest Mining Model. To start, we define a process representing the pool of honest miners, which control \((1 - m) \cdot 100\%\) of the global computational resources used for mining, with m realised as model parameter M:

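A sketch of this process; the action name publish is our own choice:

  const real M;   // attacker's fraction of the total hash rate

  action publish; // propagation of a new public block

  process HonestPool()
  {
    rate((1 - M) / 12) tau; // mine the next block: exponential delay
    publish;                // broadcast it to the network
    HonestPool()
  }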

The action on the second edge—publish in our sketch—models the propagation of a new block through the network, which can also be observed by the attacker. Due to the separation of timing and interaction in MA, we need two separate edges for the mining delay and the communication.

Attacker Model. We keep track of the length of the attacker’s fork, and of the difference in length to the public fork. To make the MA finite, we identify all fork lengths greater than \( cd \) with the value \( cd + 1\) (since we only need to know whether the fork is longer than \( cd \), but not how much longer), and we assume that the attacker gives up on its fork once it is \( db \) blocks shorter than the public one. The attacker process is then as follows:

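A sketch continuing our naming assumptions: len is the secret fork's length, ahead the difference to the public fork, gaveUp the give-up flag, and min is assumed to be a built-in function:

  const int CD; // confirmation depth
  const int DB; // give up when this many blocks behind

  int(0..CD+1) len = 0;     // secret fork length, capped at CD+1
  int(-DB..CD+1) ahead = 0; // secret minus public fork length
  bool gaveUp = false;

  process Attacker()
  {
    do {
    :: rate(M / 12) tau // the attacker mines a secret block
       {= len = min(len + 1, CD + 1), ahead = min(ahead + 1, CD + 1) =}
    :: publish {= ahead = ahead - 1 =}; // a public block appears
       if(ahead == -DB) {
         tau {= gaveUp = true =}; break // fallen too far behind
       } else {
         tau // continue the attack
       }
    }
  }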

For illustration, we use the do construct to implement a loop here instead of the recursive process call used in the honest-mining process. A do loop is in essence a looping alt: there is an initial nondeterministic choice between the child behaviours; once the chosen behaviour successfully terminates, control loops back to the nondeterministic choice. do loops can be exited via the predefined break action. We also use the if shorthand: if(e) P else Q is syntactic sugar for alt { :: when(e) P :: when(!e) Q }. Thus the behaviour of the attacker process is as follows: it waits until it either mines a new block itself (the first child behaviour of the do loop), or until it observes a new block in the public fork. In both cases, it appropriately updates the length of its secret fork and the difference to the public fork. In the second case, it then either gives up if it has fallen too far behind, or otherwise continues the attack.

Composition and Nondeterminism. The overall behaviour of our model is the parallel composition of the two processes, with synchronisation on the publish action:

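With all declared actions synchronising in par by default, the composition is simply:

  par {
  :: HonestPool()
  :: Attacker()
  }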

Observe that the behaviour of neither of the two processes contains an actual nondeterministic choice: HonestPool is entirely sequential, and the choices in the attacker process are between a Markovian and a probabilistic edge (in the do construct), i.e. the probability for both to be available at the same time is 0, and between two edges with disjoint guards (in the if). Since the only probabilistic edge in HonestPool synchronises with the attacker, and is immediately followed by a Markovian edge, the parallel composition cannot introduce nondeterminism due to interleaving probabilistic transitions, either. Thus the entire model takes the form of an MA, but is in fact equivalent to a CTMC. MA that are equivalent to CTMC are a class of models that occurs frequently in practice. Several of the MA models in the QVBS belong to this class.

Evaluating the Attack. We are interested in the probability that the attacker eventually wins, and that it eventually gives up without winning. We expect it to eventually either win or give up, thus—due to the absence of nondeterminism—the probabilities should sum to 1. We declare the two properties in Modest:

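A sketch of the function and the two properties; the names and the exact formulation of the winning condition in terms of our variables are reconstructions:

  function bool winning() = len == CD + 1 && ahead > 0;

  property P_Win    = Pmin(<> winning());
  property P_GiveUp = Pmin(!winning() U gaveUp);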

To avoid repeating the expression that characterises the winning condition, we encapsulate it in a user-defined function (winning() in our sketch). Functions in Modest can also take parameters, and they can be (mutually) recursive. The body of a function is an expression; since expressions in Modest are free of side effects, functions provide for pure functional programming inside Modest models. Combined with user-defined recursive datatypes (not shown in this paper), they make Modest Turing-complete. Property P_Win is straightforward: we ask for the (minimum) probability to eventually (<>) enter a state that satisfies the winning condition. Since there is no nondeterminism, there is no difference between Pmin and Pmax for this model. Property P_GiveUp uses the until (U) operator to ask for the probability of those paths on which no state satisfies the winning condition until a state where the give-up flag is \( true \) is reached. If we invoke mcsta on this model by executing

./modest mcsta bitcoin-ds.modest -E "M = 0.2, CD = 6"

we obtain probability \(\approx 0.0087\) for P_Win and \(\approx 0.9913\) for P_GiveUp: the attack is unlikely to succeed if the attacker controls only \(20\%\) of the hash rate. However, at \(m = 0.4\), the success probability is already significantly higher, and at \(m = 0.5\), it is \(\approx 0.719\). It is not 1 here because the attacker gives up when falling behind too much. If we modify the model such that the attacker never gives up, it becomes an infinite-state MA since the difference between the secret and the public fork length is no longer bounded from below. We cannot model-check this model, but due to the absence of nondeterminism, we can easily perform statistical model checking with modes by running

./modest modes bitcoin-ds-inf.modest -E "M=0.2, CD=6" --max-run-length 0

The output confirms our expectation that the probability is now 1, although we only know this with the statistical confidence provided by modes.

4.2 Optimising the Attack on Trust in Bitcoin

If the goal of the attack is to undermine the trust in the Bitcoin system by invalidating a large amount of work performed by the honest miners, the attacker gains some freedom in choices: Instead of having to give up when it gets too far behind, it can simply restart its attack from the then-current public fork. We thus keep the \( cd \) parameter, which now indicates the minimum desired length of the secret fork for it to be published. The winning condition becomes the length of the secret fork being greater than or equal to \( cd \). Instead of only giving up (which now means resetting the secret fork) when \( db \) blocks behind, the attacker can additionally choose to continue the attack or reset its fork every time that the honest mining pool publishes a new block.

Modelling the Attack. Our new attacker process, which replaces the process presented previously, is thus as follows:

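A sketch under the same assumptions; cont and restart are our names for the two new non-synchronising actions:

  action cont, restart;

  process Attacker()
  {
    do {
    :: rate(M / 12) tau
       {= len = min(len + 1, CD + 1), ahead = min(ahead + 1, CD + 1) =}
    :: publish {= ahead = max(ahead - 1, -DB) =};
       alt {
       :: when(ahead > -DB) cont           // keep extending the secret fork
       :: restart {= len = 0, ahead = 0 =} // start over from the public fork
       }
    }
  }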

This model is nondeterministic due to the choice between continuing and restarting in the attacker process. We use the two actions (cont and restart in our sketch) to indicate the choice made; they have no synchronisation partner, but will help understand the optimal scheduler.

Evaluation. The probability for the attacker to eventually win as expressed by an adjusted version of P_Win is now 1 since it can retry indefinitely. It is thus more interesting to investigate the expected time until it wins:

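A sketch of such a property, with T denoting elapsed time and Xmin asking for its minimal expected value when the condition first holds:

  property T_Win = Xmin(T, len >= CD);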

We ask for the minimum time here, i.e. for the attacker to make its choices such that the time to success is minimised, which arguably is its best strategy. mcsta reports that the value is \(\approx 3735.94\) minutes for \(m = 0.2\), i.e. a little over two and a half days. Let us thus compute the probability to succeed in just two days:

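With the model's time unit being minutes, two days are 2880 time units; a sketch:

  property P_Win2d = Pmax(<>[T<=2880] (len >= CD));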

We now ask for the maximum probability, since this again corresponds to an optimal attack. The result that mcsta gives is \(\approx 0.535\). As originally discovered in [24], we thus have a more than \(50\%\) chance to undermine the trust in Bitcoin if we control only \(20\%\) of the hash rate and invest only two days of mining. According to blockchain.com/pools, on July 8, 2019, the BTC.com pool in fact controlled \(21.6\,\%\) of the global hash rate; it could thus perform the attack.

Optimising the Attack Strategy. While the above numbers tell us the time and probability for the attack to succeed, they do not give any information about the attack strategy: What are the points, in terms of the length of the secret and public forks, where we should restart in order to obtain these optimal times and probabilities? Probabilistic model checking as implemented by mcsta, however, implicitly computes the optimal choice for every state of the MA underlying the model it checks, and it can be instructed to write this scheduler to a file:

./modest mcsta bitcoin-attack.modest -E "M=0.2,CD=6" --scheduler sched.txt

The result is a text file sched.txt with entries of the following form, one for every state.

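With the variable names of our sketch, such an entry could look like this (the exact format depends on the tool version):

  len=1, ahead=-2: restart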

Here, in a state where the secret fork's length is 1, and it is two blocks shorter than the public one, the attacker restarts. We processed the file by projecting to the two fork-length counters and then eliminating all subsequent duplicate entries to find that the optimal strategy is to restart the attack if

  • the honest pool announces a block, but the secret fork is still empty,

  • the secret fork has one block and the public fork adds a third block, or

  • the secret fork has \({\ge }2\) blocks and gets 3 blocks shorter than the public one,

and to continue the attack in all other cases.

Summary. Throughout this section, we first built an MA model that was equivalent to a CTMC, and then a truly nondeterministic MA. However, even that model does not use all features of the MA formalism: it lacks discrete probabilistic branching. As such, it falls into the interactive Markov chain (IMC, [39]) subset of MA. In the next section, we will introduce a model that is a true MA.

5 Evaluating a Reentrant Queueing System

In the previous section, we considered quantitative aspects of attacks on a stochastic timed system. We now turn our attention to a prominent use of continuous-time Markov models: performance and dependability evaluation. A classic application is resource-sharing queueing systems, using various CTMC-based formalisms like (Jackson) queueing networks [41], with analytical or simulation-based techniques for the analysis. Yet these approaches are restricted, both in modelling and in analysis, to fully stochastic systems. MA as a model, and our analysis tools in the Modest Toolset, sit right at the edge between performance evaluation and model checking [2]. In particular, they add the concept of nondeterminism, which is at the core of classic qualitative model checking, to modelling formalisms and analysis algorithms that directly apply to performance evaluation scenarios. We now study a queueing system with stochastic timing, discrete probabilistic choices, and nondeterministic decisions—its model is thus an MA that does not fall into any of the existing subsets.

Fig. 7. A queueing system with postprocessing needs [36]

We consider the system with two queues depicted in Fig. 7, originally presented in [36]. Both queues have the same capacity c. Jobs arrive with rate \(\lambda \) and enter one of the queues according to the standard join-the-shortest-queue strategy. This strategy is implicitly nondeterministic if both queues are equally filled. For each queue, jobs are processed by a dedicated server, serving jobs with rates \(\mu _u\) and \(\mu _d\), respectively. Jobs leaving the lower server leave the system, while jobs once processed by the upper server are subject to an additional check. Dependent on the (nondeterministic) outcome thereof, they are either sent into the lower queue again (action d), or (action u) they may either leave the system (with probability p) or reenter the upper queue (with probability \(1-p\)).

A Modest Model. As usual, we start our Modest model by declaring all relevant constants, including the model parameters without specified values:

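A sketch; only the capacity C is left open as a parameter, and the remaining values are purely illustrative:

  const int C;           // queue capacity, set via -E at analysis time
  const real LAMBDA = 5; // arrival rate λ (illustrative)
  const real MU_U = 3;   // service rate μu of the up server (illustrative)
  const real MU_D = 2;   // service rate μd of the down server (illustrative)
  const real P = 0.5;    // probability p that a checked job leaves (illustrative)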

In this model, we will use two transient variables to track when jobs are done, and when a job is dropped because both queues are full on arrival, or the queue in which it is due to re-enter after being processed by the up server is full:

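A sketch of the two declarations, assuming a transient modifier as in the underlying Jani format:

  transient int(0..1) done = 0;    // 1 on branches where a job leaves the system
  transient int(0..1) dropped = 0; // 1 on branches where a job is lost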

Unlike regular variables, transient variables do not become part of the states. They can be used in assignments, but the assigned values are lost once the successor state is entered. However, the assigned value is visible to properties when the branch is taken, and we will make use of this later to define rewards.

We structure our model along the components shown in Fig. 7, defining a Modest process for each of them. The arrivals process and the down server have the simplest behaviours:

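Sketches of the two processes; enqueue and dequeue are our names for the synchronising actions (the synchronisation vectors below route them to the right queue):

  action enqueue, dequeue;

  process Arrivals()
  {
    rate(LAMBDA) tau; // a new job arrives
    enqueue;          // hand it to one of the queues
    Arrivals()
  }

  process DownServer()
  {
    dequeue;                       // fetch a job as soon as one is available
    rate(MU_D) tau {= done = 1 =}; // serve it; the job then leaves the system
    DownServer()
  }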

Both processes synchronise with the input queues: Arrivals uses action enqueue to hand over a job that just arrived, and DownServer uses action dequeue to obtain a job to work on when idle, as soon as one is available. We will use synchronisation vectors to ensure that the synchronisation on enqueue happens between Arrivals and exactly one of the two queues. Both queues use the same process definition:

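A sketch of the shared definition; the array and function declaration syntax are best-effort reconstructions:

  int(0..C)[] len = [0, 0]; // current length of each queue

  function bool shortest(int id) = len[id] <= len[1 - id];

  process Queue(int id)
  {
    do {
    :: when(shortest(id) && len[id] < C)
       enqueue {= len[id] = len[id] + 1 =} // accept the job
    :: when(shortest(id) && len[id] == C)
       enqueue {= dropped = 1 =}           // full: the job is lost
    :: when(len[id] > 0)
       dequeue {= len[id] = len[id] - 1 =} // a server takes a job
    }
  }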

To distinguish the two queues, we use a process parameter id, and a two-element array storing the lengths of the queues that the processes index with their id. Function shortest indicates whether the queue with the given id is no longer than the other one. A queue only accepts new jobs when this is \( true \); if the queue is full in that case, the job is dropped, and dropped is (temporarily) set to 1. The dequeue action removes a job from a non-empty queue.

Finally, the up server has the most complicated structure, since it manages the reentry of jobs that it has finished serving into the two queues:

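A simplified sketch; in particular, the re-entry of a job into the upper queue needs a synchronising action (mapped to that queue's enqueue) rather than the plain tau shown here:

  action u, d;

  process UpServer()
  {
    dequeue;        // fetch a job from the upper queue
    rate(MU_U) tau; // serve it
    alt {           // nondeterministic outcome of the check
    :: d            // send the job to the lower queue
    :: u palt {
       :P:     {= done = 1 =} // the job leaves the system
       :1 - P: tau            // the job re-enters the upper queue
       }
    };
    UpServer()
  }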

The nondeterministic choice between d and u is a choice between (d) making the job surely leave the system within a certain expected time, at the cost of processing by the slower down server, and (u) taking the chance for the job to leave the system immediately, at the risk of it reentering the up queue. The optimal choice will likely depend on the current lengths of both queues.

Now that we have specified all the necessary processes, we can put them into a parallel composition. We have rather different synchronisation requirements: enqueue shall use a binary synchronisation between Arrivals and one of the two queues, with a nondeterministic choice if both have the same lengths; dequeue in a queue shall synchronise only with the one server for that queue; and d and u shall look like an enqueue to the respective queues. We could cleverly use the relabel construct to rename actions in a way that makes Modest create the correct synchronisation vectors internally. However, we can also just specify the desired vectors explicitly in the parallel composition:

(Parallel composition with explicit synchronisation vectors, written with one vector per column.)

If we read the “columns” in the above specification from bottom to top, we read the synchronisation vectors, with the topmost entry being the action that labels the synchronising edge in the composed MA, and empty entries corresponding to \(\bot \).

Performance Evaluation. We first add properties to investigate the probability and time until the queues are full, which is an undesirable condition that affects the dependability of the system by making it likely for jobs to be lost:

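A sketch of the five properties, assuming that a property may compare a probability query with a constant (first line):

  property SanityCheck = Pmin(<> (len[0] == C && len[1] == C)) == 1;
  property TminFull    = Xmin(T, len[0] == C && len[1] == C);
  property TmaxFull    = Xmax(T, len[0] == C && len[1] == C);
  property PminFull10  = Pmin(<>[T<=10] (len[0] == C && len[1] == C));
  property PmaxFull10  = Pmax(<>[T<=10] (len[0] == C && len[1] == C));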

We thus assert that the minimum probability for both queues to eventually be full is 1, which is a sanity check for the model; then we ask for the minimum and maximum of the expected time for both queues to be full, and of the probability for this to happen within 10 time units. By repeating the bottom two properties for different values of the time bound, we can obtain an approximation of the underlying cumulative distribution function over time. If we run mcsta with

./modest mcsta reentrant-q.modest -E"C=5" -O results.txt Minimal

we get an easy-to-parse file results.txt with the results.


The results show that the nondeterministic choices have a significant influence on the behaviour of the system; between the worst and best choices, the time to and the probability of the undesirable event differ by a factor of 6 to 7. Since the standard probabilistic model checking algorithms implemented in mcsta are iterative numeric algorithms using double-precision floating-point numbers, every result is only an approximation of the true value despite the high number of decimal digits included in the output. The precision of mcsta is configurable.

Assume that we are designing a system of which our reentrant queueing system is an abstract model, and we have one parameter for which we must decide on a concrete value: the queue capacity c. We expect a higher capacity to improve throughput and utilisation and to reduce the number of lost jobs; however, it is also more costly to implement. We would thus like to find a good tradeoff between c and these quantities. We first specify the corresponding properties, which query for long-run average rewards.


The rewards are described by accumulation expressions: one attaches to every branch (i.e. to every discrete step) the value of done after the branch's assignments have been executed (but before transient variables lose their values) as a branch reward; another sets the rate reward (accumulated over time) in every state to 1 if both queues are empty, and to 0 otherwise. We chose maximisation or minimisation for each quantity as appropriate to correspond to the best possible strategy. We can ask mcsta to compute these quantities for many different values of c by specifying multiple experiments via the -E parameter:

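Assuming that mcsta accepts one -E flag per experiment (the repetition of -E is our assumption; the flag itself is used throughout this paper), the invocation could look like:

./modest mcsta reentrant-q.modest -E "C=2" -E "C=4" -E "C=6" -E "C=8" -O results.txt Minimal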

We visualise the results in Fig. 8. Two of the four lines (red, upper, and orange, lower) converge to zero as c grows, while the other two (blue, upper, and purple, lower) change much less. We see that the fraction of time that the servers spend idle drops quickly with increasing c, whereas throughput and loss do not improve as much. Looking at this plot, we might choose c around 5 to 8.

Summary. In this section, we built a model for a queueing system that utilises all the features of the MA formalism. mcsta offers algorithms to calculate a variety of quantities (cf. Sect. 2), and we fully utilised them to evaluate the system from several perspectives.

Fig. 8. Long-run average performance values for the reentrant queueing system

6 Conclusion

This tutorial paper has discussed how Modest can be used as a convenient modelling language for Markov automata, together with some hints on what analysis is possible for such models. Markov automata can be considered as a central model family for studying the performance, dependability, and correctness of randomised and distributed systems.

We introduced all the basic and several advanced constructs of the Modest language for MA. Among the features that we did not cover are exception handling (using the try and throw constructs), the specification of values for transient variables in locations, dynamic array constructors, user-defined recursive datatypes (which allow the specification of, for example, unbounded list types), recursive functions, and further kinds of actions that automatically generate appropriate synchronisation vectors, just like "normal" actions do for multi-way synchronisation. Going beyond MA, Modest also supports the formalisms of probabilistic timed automata [42] (which add a clock type and time progress conditions via the invariant construct), stochastic timed automata [4] (which allow sampling values from continuous probability distributions in assignments; they are a generalisation of MA), and stochastic hybrid automata [25] (which add continuous variables whose behaviour over time is specified via differential equations and inclusions using the der operator for derivatives). Further Modest models are included in the Modest Toolset download, available at modestchecker.net, and in the Quantitative Verification Benchmark Set at qcomp.org.

Data Availability. The models, example command lines, and results presented in this paper are archived and available at DOI 10.4121/uuid:5a73169e-b494-411b-b3a8-051e62efba9e [31].