1 Introduction

In its simplest definition, runtime verification (RV) [26] solves the word problem: whether a certain property (for example, expressed as an \(LTL \) formula) is satisfied by a system run, given the run or a prefix of it. In recent years, advanced RV paradigms have emerged, such as stream runtime verification (SRV), which extend the traditional notion of runtime verification. First, SRV allows computations and outputs over arbitrary data domains, not only atomic Boolean propositions and verdicts as for \(LTL \). Second, SRV specifications describe “point-wise properties”, which assign an output to every position of the trace (instead of a single verdict for the trace as a whole). This is especially useful for identifying specific points in a trace, e.g. error locations.

Figure 1 shows an SRV specification in the pioneering formalism Lola  [9] (see Sect. 2), which we will use as a running example. The scenario models a vacuum cleaner robot in a house with four rooms, connected by open doors. The charging station is located in room \(r_0\). We want to check the following property: “The robot may not enter rooms if its battery is not charged enough to be able to reach the base station.” The Lola specification for this property defines four input streams \(r_0, \dots , r_3\) of type Boolean and one stream e of type real. A stream is a sequence of data values over time. The input streams originate from the robot system and are incrementally passed to the monitor. The values (events) of streams \(r_0,\dots ,r_3\) encode the current location of the robot, while e contains the battery charge (between 0% and 100%). The Lola specification defines a Boolean output stream \(\textit{err}\) that captures the error condition: the robot is not in room 0 and its battery has dropped below 5%. Output streams contain events in synchrony with input streams, so every instant of the input streams produces a value of \(\textit{err}\), revealing whether the system has run into an error. That is, this specification is a point-wise property. The specification also defines \(\textit{Ferr}\), which is true if \(\textit{err}\) is true now or at some point in the future (by referring to \(\textit{Ferr}\) at the next instant, with default value false at the trace end).

Fig. 1. Example Lola specification and room map for a vacuum cleaner robot.
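A sketch of the specification, reconstructed from the textual description and Example 3 (the concrete Lola syntax is an assumption):

$$ \begin{array}{l} {\textbf {Input bool }} r_0, r_1, r_2, r_3 \qquad {\textbf {Input real }} e \\ {\textbf {Def }} \textit{err} := \lnot r_0[\textit{now} ] \wedge (e[\textit{now} ] < 5) \\ {\textbf {Def }} \textit{Ferr} := \textit{err} [\textit{now} ] \vee \textit{Ferr} [+1|\textit{ff}] \end{array} $$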

Monitoring can be performed online or offline. In offline monitoring the input trace is completely known upfront, for example as a log file. An online monitor, on the other hand, receives the trace event by event while the observed system is running. In this paper we deal with online runtime verification. There is a significant difference between the two kinds of monitoring when the specification contains future references (such as stream \(\textit{Ferr}\) above). Future references are not a problem in offline RV, because future input values can easily be accessed, but they are unknown in online monitoring. In general there are two strategies for handling future references in online RV: (1) stalling calculations until all relevant input events are accessible [9]; (2) casting at each step an output that is as precise as possible with the information available (e.g. a set or interval of possible values). For a Boolean stream these outputs could be \(\top = \{\textit{tt}\}\), \(\bot = \{\textit{ff}\}\) or \(? = \{\textit{tt},\textit{ff}\}\) when both values are possible, depending on future inputs. This strategy is used in \(LTL _3\) monitoring [2], but not for point-wise properties. Online monitoring of point-wise properties that emits the best possible sets of valuations is called perfect recurrent monitoring in [20, 21].

A stream being defined using future references does not necessarily imply that a ? verdict has to be cast. Consider \(\textit{Ferr}\) above: if the value of \(\textit{err}\) is true at some instant then \(\textit{Ferr}\) is true now, independently of future events. Moreover, additional knowledge about the monitored system, available in the form of assumptions [4, 18, 24], allows reducing the set of possible valuations. Consider again our running example and assume that the robot consumes 3% of energy when passing from one room to the next. If the robot is in room 3 with an energy level below 8%, we may conclude that room 0 cannot be reached without the battery first dropping below 5% (and thus \(\textit{Ferr}\) is \(\textit{true}\)). This kind of monitoring is called anticipatory [3].

In this paper we study the problem of anticipatory monitoring for Lola under assumptions and also under uncertainties (missed or imprecise sensor values) in the input trace. For propositional properties, whether a prefix satisfies or violates a property in all continuations can be modeled using (Büchi) automata, whose emptiness can be effectively checked. For richer data domains the problem is more complex: it requires reasoning about satisfiability and validity in richer theories, which can be computationally expensive or even undecidable, and it requires reasoning about all futures in terms of finite formulas (instead of automata).

Related Work. Early RV research focused mostly on the monitoring of \(LTL \)  [29] properties. The \(LTL _3\) monitoring approach [2] was the first to consider anticipation, by reasoning about all possible trace continuations. More expressive RV formalisms were later introduced adding notions of time or complex data values in the traces. Examples include signal temporal logic (STL) [27], mission time \(LTL \)  [31], Eagle [14] or metric first order temporal logic (MFOTL) [1]. A prominent class of extended RV approaches is SRV, pioneered by Lola  [9], and later extended in asynchronous languages like RTLola [11, 12], TeSSLa [6, 22] and Striver [15, 16]. Many RV formalisms can be encoded in Lola  [20]. Recurrent monitoring [21] was first studied in [17] for past \(LTL \) and later extended with resets [4, 5], and also for Lola  [20]. The use of symbolic representations for monitoring (also to handle uncertainty) has recently been studied [4, 5, 10, 13, 34] and also applied to Lola  [19]. Considering assumptions during monitoring was first proposed in [24] (under different wording) and later successfully adapted and extended [4, 5, 19, 35]. The topic is theoretically studied in [18]. The approach that we present in this paper is based on the theory of abstract interpretation [7, 8], which was used in RV to handle uncertainties in [25].

The works closest to this paper are [5, 13] which study symbolic anticipatory \(LTL \) monitoring with linear arithmetic sub-formulas. The former [5] also considers uncertainties and assumptions.

In this paper we first introduce variations of the original Lola semantics: We give a monitoring semantics that defines the perfect monitoring results for uncertain stream prefixes. Based on this, we define the instant semantics and then, more importantly, the transformer semantics, which also capture perfect monitoring outputs but discard unnecessary information relating the current instant to all past and future events, and which can be computed deterministically. We then introduce a general abstraction framework for the effective computation of the transformer semantics and derive an efficient, anticipatory Lola monitoring algorithm. Provided with a sound or perfect abstraction for the stream values (e.g. one from the extensive literature on abstract interpretation), we present a general algorithm to monitor Lola specifications with future references. We give a criterion for the existence of perfect monitoring, and present a technique based on widening to produce a sound monitor if perfect monitoring is impossible. Then, we instantiate our general framework for linear real arithmetic specifications using symbolic computation. Finally, we report on an empirical evaluation of a prototype implementation of our approach on three complex case studies.

Contributions. Compared to previous works (esp. [5, 13]) the main contributions of our approach are:

  • The anticipated monitor outputs may be of richer data types than Boolean.

  • The monitor is able to produce arbitrarily many outputs per time step.

  • Instead of unrolling a specification from the beginning to handle anticipation, we unroll from the back until an invariant is found which is then used to efficiently look ahead during the actual monitoring.

  • If no perfect anticipation exists, we provide sound over-approximations instead.

  • We are not restricted to symbolic reasoning but provide a general abstraction-based monitoring framework.

2 Lola Monitoring Revisited

2.1 Recurrent Monitoring

Recurrent monitoring starts from a point-wise property, which assigns a valuation to every position of a trace. Traditionally, valuations are Boolean or other truth domains [33]. Here, we consider valuations from an arbitrary data domain.

Definition 1

(Point-wise property). A point-wise property \(\mathcal {P}\) of words of length n over domain \(\varGamma \) into domain \(\mathbb {D}\) is a function \(\mathcal {P}: \varGamma ^n \times \{1,2,\dots ,n\} \rightarrow \mathbb {D}\).

In online monitoring of point-wise properties, the input \(w \in \varGamma ^n\) is not available at once but provided incrementally, and the monitor produces an output after each input letter. A monitor may output several possible values from \(\mathbb {D}\), which in practice are encoded, e.g., as an interval or as ? (for all values). We identify a monitor with its characteristic function \(\overline{M}: \varGamma ^{\le n} \rightarrow 2^\mathbb {D}\), which maps prefixes of inputs to sets of possible outputs. After the first k letters of the input, a recurrent monitor [21] tries to evaluate the corresponding property at position k. A sound recurrent monitor outputs a superset of the possible verdicts at the current instant (compatible with all possible future input continuations). The monitor is perfect if it casts exactly the set of possible property valuations.

Definition 2

(Sound/perfect recurrent monitor). Given a point-wise property \(\mathcal {P}\) and a non-empty input prefix \(w \in \varGamma ^{\le n}\), the set of possible verdicts after w is \(\textit{pos} (w)=\{\mathcal {P}(wv,|w|) \mid v \in \varGamma ^{n-|w|}\}\). A recurrent monitor M for \(\mathcal {P}\) is sound whenever for every w, \(\overline{M}(w) \supseteq \textit{pos} (w)\). M is perfect if \(\overline{M}(w) = \textit{pos} (w)\).
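As an illustration of Definition 2, the following Python sketch enumerates \(\textit{pos} (w)\) by brute force for a small finite alphabet and a toy point-wise property; the function names and the example property are our own and only serve to make the definition concrete:

```python
from itertools import product

def pos(prop, gamma, n, w):
    """Set of possible verdicts pos(w) for a point-wise property prop.

    prop maps a full word (tuple of letters) and a 1-based position to a value,
    gamma is the finite input alphabet, n the full trace length and w a
    non-empty prefix.  Exponential in n - |w|; only meant to illustrate Def. 2.
    """
    k = len(w)
    return {prop(w + v, k) for v in product(gamma, repeat=n - k)}

# Toy point-wise property over Booleans: "some letter at or after the current
# position is True" (an F-like property).
def f_prop(word, position):
    return any(word[position - 1:])

print(pos(f_prop, (False, True), 3, (False,)))        # {False, True}: still open
print(pos(f_prop, (False, True), 3, (False, True)))   # {True}: already satisfied
```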

2.2 Lola

A Lola specification defines a transformation from a tuple of input streams to a tuple of output streams. A finite stream of type \(\mathbb {D}\) over a time domain \(\mathbb {T}= \{0,1,\dots ,t_{\textit{max}} \}\) is a function \(s \in \mathcal {S} _\mathbb {D}:= \mathbb {T}\rightarrow \mathbb {D}\) that assigns a data value to every instant in \(\mathbb {T}\). In this work we fix \(t_{\textit{max}} \) and thus \(\mathbb {T}\). We use sequences to represent streams and their prefixes: given \(s = \langle 3, 4, 2 \rangle \) we have \(s(0) = 3\), \(s(1) = 4\), \(s(2) = 2\).

A Lola specification [9] is given as an equation system, which defines output streams in terms of input and other output streams. The set of Lola expressions over a set of stream identifiers S, \(\textit{Expr}_S\), is recursively defined as

$$ \textit{Expr}_S := c \mid f(\textit{Expr}_S, \dots , \textit{Expr}_S) \mid s[o|c] $$

where \(s \in S\) is a stream identifier, c a constant value, f a function symbol, and \(o \in \mathbb {Z}\) an integer offset. A constant expression is interpreted as a stream with that constant value at all instants; a function application as the stream obtained by applying the function to the argument streams' events at every instant. The operator s[o|c], called the offset operator, describes a stream which carries the values of stream s, shifted by o instants. To refer to past events, o is chosen negative. If the accessed instant does not exist because it lies beyond the trace bounds (beginning or end), the default value c is used instead. For offset operators with offset 0 the default value plays no role, thus we write \(s[\textit{now} ]\) (or simply s) for s[0|c] with an arbitrary constant c.

Syntax. A Lola specification \(\varphi = (I,S,E)\) is a 3-tuple where I is a finite set of input stream identifiers; S is a finite set of output stream identifiers with \(I \cap S = \emptyset \); and \(E : S \rightarrow \textit{Expr}_{I \cup S}\) assigns a defining expression to every output stream. For the rest of the paper, we assume that specifications are flat, i.e. they only contain offsets \(-1,0,+1\). Every specification can be flattened by introducing additional streams and splitting larger offsets into a chain of \(\pm 1\) offsets.
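For instance, assuming an auxiliary stream identifier a (our own naming), an occurrence of s[+2|c] can be replaced by a[+1|c] after adding the definition

$$ {\textbf {Def }} a := s[+1|c] $$

At the last two instants both the original and the flattened access fall back to the default value c, so the semantics is preserved.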

Semantics. The formal semantics of a Lola specification \(\varphi = (I,S,E)\) with input streams \(I = \{i_1,\dots ,i_n\}\) and output streams \(S = \{s_1,\dots ,s_m\}\) maps a tuple of concrete input streams to the corresponding tuple of concrete output streams as follows. Given a tuple of input streams \(\varSigma = (\sigma _1,\dots ,\sigma _n)\) the semantics \(\llbracket e \rrbracket _\varSigma \in \mathcal {S} _{\mathbb {D}}\) of an expression \(e \in \textit{Expr}_{I \cup S}\) of type \(\mathbb {D}\) is:

  • \(\llbracket c \rrbracket _\varSigma (t) = c\)

  • \(\llbracket f(e_1,\dots ,e_n) \rrbracket _\varSigma (t) = f(\llbracket e_1 \rrbracket _\varSigma (t),\dots ,\llbracket e_n \rrbracket _\varSigma (t))\)

  • \(\llbracket i_j[o|c] \rrbracket _\varSigma (t) = {\left\{ \begin{array}{ll} \sigma _j(t+o) &{} \text { if } t+o \in \mathbb {T}\\ c &{} \text { otherwise}\end{array}\right. }\)

  • \(\llbracket s_j[o|c] \rrbracket _\varSigma (t) = {\left\{ \begin{array}{ll} \llbracket E(s_j) \rrbracket _\varSigma (t+o) &{} \text { if } t+o \in \mathbb {T}\\ c &{} \text { otherwise}\end{array}\right. }\)

The semantics of \(\varphi \), \( \llbracket \varphi \rrbracket : \mathcal {S} _{\mathbb {D}_1} \times \dots \times \mathcal {S} _{\mathbb {D}_n} \rightarrow \mathcal {S} _{\mathbb {D}'_1} \times \dots \times \mathcal {S} _{\mathbb {D}'_m} \) is given as

$$ \llbracket \varphi \rrbracket (\varSigma ) = (\llbracket E(s_1) \rrbracket _{\varSigma }, \dots , \llbracket E(s_m) \rrbracket _{\varSigma }) $$

This Lola semantics is well-defined if no stream event's value depends on itself. This is the case when the dependency graph of the specification contains no self-loops, which can easily be checked [9]. We assume that all Lola specifications are well-defined. With \({\textbf {D}}:= \mathbb {D}_1 \times \dots \times \mathbb {D}_n\) and \({\textbf {D}}' := \mathbb {D}'_1 \times \dots \times \mathbb {D}'_m\), the induced point-wise property of a specification \(\varphi \) is the function \(\mathcal {P}_\varphi : {\textbf {D}}^{t_{\textit{max}} +1} \times \mathbb {T}\rightarrow {\textbf {D}}'\) defined as

$$ \begin{array}{l} \mathcal {P}_\varphi (w, t) = (s_1(t),\dots ,s_m(t)) \end{array} $$

where \((s_1,\dots ,s_m) = \llbracket \varphi \rrbracket (w)\). Thereby we implicitly understand w as a tuple of streams.
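To make the semantics concrete, the following Python sketch evaluates the err/Ferr specification of Fig. 1 (as reconstructed above) offline on fully known input streams; the function name and encoding are our own:

```python
def eval_err_ferr(r0, e):
    """Offline Lola semantics of the err/Ferr specification on full streams.

    r0 is a list of Booleans (robot in room 0) and e a list of battery levels;
    both have length |T| = t_max + 1.  Returns the output streams (err, Ferr).
    """
    t_max = len(r0) - 1
    # err only uses offset-0 references, so it is computed instant by instant.
    err = [(not r0[t]) and e[t] < 5 for t in range(t_max + 1)]
    # Ferr(t) = err(t) or Ferr(t+1) with default ff beyond the trace end,
    # so it is resolved from the back.
    ferr = [False] * (t_max + 1)
    for t in range(t_max, -1, -1):
        nxt = ferr[t + 1] if t + 1 <= t_max else False
        ferr[t] = err[t] or nxt
    return err, ferr

# The robot leaves room 0 and the battery eventually drops below 5%.
print(eval_err_ferr([True, False, False, False], [10.0, 7.0, 4.0, 1.0]))
# -> ([False, False, True, True], [True, True, True, True])
```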

Assumptions. Assumptions are knowledge about the system and its environment [18], which restricts the set of possible input and output traces. Consider again Fig. 1. First, the robot can only be in one room at a time, so exactly one of \(r_0,r_1,r_2,r_3\) must be true at any instant. The map also limits the transitions: if \(r_1\) is true at some instant, only \(r_0, r_1, r_3\) can be true at the next instant, but not \(r_2\). We can also make assumptions about energy consumption (for example, at least \(3\%\) of energy is used at every instant). We follow [19] and encode assumptions in Lola, using a special stream \(\varLambda \) which we assume to be true at every instant. The assumptions above can, e.g., be encoded as follows:

$$ \begin{array}{rl} {\textbf {Def }} \varLambda := &{} (r_0[\textit{now} ] \leftrightarrow \lnot (r_1[\textit{now} ] \vee r_2[\textit{now} ] \vee r_3[\textit{now} ])) \wedge \dots \wedge \\ &{} (r_0[\textit{now} ] \rightarrow (r_0[1|\textit{tt}] \vee r_1[1|\textit{tt}] \vee r_2[1|\textit{tt}])) \wedge \dots \wedge \\ &{} (e[\textit{now} ] \le e[-1|103] - 3) \end{array} $$

Given a specification \(\varphi \) with assumption \(\varLambda \) and a tuple of input streams \(\varSigma = (\sigma _1,\dots ,\sigma _n)\) we write \(\varSigma \,\models _\varLambda \,\varphi \) if \(\llbracket \varphi \rrbracket (\varSigma )\) yields an output that only contains \(\textit{tt}\) events for \(\varLambda \).

Recurrent Lola Monitoring. Based on Definition 2 we define a sound and perfect recurrent Lola monitor as a recurrent monitor for the induced point-wise property of a specification, taking assumptions into account.

Given a Lola specification \(\varphi \) over input data types \({\textbf {D}}\) and given assumption \(\varLambda \), the set of possible verdicts after a non-empty input prefix \(w \in {\textbf {D}}^{\le t_{\textit{max}}}\) is \(\textit{pos} _\varphi (w)=\{\mathcal {P}_\varphi (wv,|w|-1) \mid wv \in {\textbf {D}}^{t_{\textit{max}} +1} \wedge wv\, \models _\varLambda \,\varphi \}\).

Definition 3

(Sound/perfect recurrent Lola monitor). A recurrent Lola monitor M is:

  • sound    iff for every non-empty \(w\in {\textbf {D}}^{\le t_{\textit{max}}}\), \(\overline{M} (w)\supseteq \textit{pos} _\varphi (w)\).

  • perfect iff for every non-empty \(w\in {\textbf {D}}^{\le t_{\textit{max}}}\), \(\overline{M} (w)=\textit{pos} _\varphi (w)\).

Lola monitors receive input streams instant by instant and, per input, cast the set (or an over-approximation) of the possible output stream value tuples.

Several monitoring approaches can be reduced to recurrent monitoring by modifying the specification. For example, consider a Boolean stream b representing a property. The initial value of this property (the value of b at position 0) can be monitored iteratively by introducing an additional stream \({\textbf {Def }} s = {\textbf {if }} \textit{first} {\textbf { then }} b[\textit{now} ]{\textbf { else }} s[-1|\textit{ff}]\). Note that s at instant 0 takes the value of b and otherwise takes the previous value of s. A recurrent monitor for s outputs increasingly precise verdicts about the initial property b. This monitor simulates the typical initial monitor, for example LTL\(_3\) monitoring [2]. Recurrent Lola monitors further subsume monitoring with resets [4]; monitoring instants at a fixed offset k from the current instant, or a fixed-size window around it; monitoring the distance to the next instant at which a violation of a property occurs (see [21]); counting violations; etc. All these notions can be handled by recurrent monitoring after introducing additional streams in the specification.

Perfect recurrent monitoring requires reasoning about possible future continuations of a trace. This ability, however, especially in combination with assumptions, makes recurrent monitors very powerful. The vacuum cleaner robot example above could include the following four stream definitions:

$$ {\textbf {Def }} \textit{enter}_{i \in \{0,1,2,3\}} := r_i[+1|\textit{false}] \wedge \lnot \textit{Ferr} [\textit{now} ] $$

Note that if a recurrent monitor yields the verdict \(\bot = \{\textit{ff}\}\) for one of these streams, entering the corresponding room will inevitably cause \(\textit{Ferr} \) to be true, which means that the base station cannot be reached anymore with the remaining battery energy. On the other hand, the verdict \(? = \{\textit{tt},\textit{ff}\}\) implies that it is possible that \(\textit{Ferr} \) is false when the corresponding room is entered. This way, a higher-level planning system can use the information that the monitor provides to steer the robot and prevent it from entering rooms which will inevitably cause an error. If the battery level is critical and the robot always follows a path for which ? verdicts are obtained, it will eventually end up in room 0. In this example, anticipatory verdicts are possible if the assumptions included in the specification reveal information about where the robot can drive and how much energy it consumes.

3 Lola Recurrent Online Monitoring Semantics

We now introduce a novel Lola semantics for recurrent online monitoring. While the original semantics from Sect. 2 describes a relation between fully known input and output streams (i.e. an offline semantics), we now give a semantics that relates prefixes of input streams with partially known output streams. We base our definition on monitoring stream tuples (inspired by [32]) which represent a set of possible (complete and fully known) stream tuples:

Definition 4

(Monitoring stream tuple). A monitoring stream tuple of n streams of types \(\mathbb {D}_1, \dots , \mathbb {D}_n\) is an element from \(\mathcal {T}_{\mathbb {D}_1, \dots , \mathbb {D}_n} := 2^{\mathcal {S} _{\mathbb {D}_1} \times \dots \times \mathcal {S} _{\mathbb {D}_n}}\).

We will use monitoring stream tuples in two ways: (1) to define input stream prefixes, which are only known up to a certain instant \(t \in \mathbb {T}\); and (2) to encode uncertain input readings. (Note that the first case is a special case of the second, where all events after t are fully unknown.) The idea is that the monitoring stream tuple is the set of all complete and fully known input streams that are compatible with the (uncertain) input readings received so far.

Example 1

Consider again the robot example from Fig. 1 for \(\mathbb {T}= \{0,1,2,3,4\}\), where the received trace prefix is known up to instant 3. Assume that the robot started in room \(r_0\) and moved to \(r_1\) and then to \(r_3\); it is then uncertain whether the robot remained in \(r_3\) or moved back to \(r_1\). Furthermore, the energy started at \(100\%\) and was reduced by \(3\%\) per step, but the sensor has an uncertainty of \(\pm 1 \%\). This input is encoded by the following monitoring stream tuple, where the streams follow the order \(r_0\), \(r_1\), \(r_2\), \(r_3\), e:

$$ \begin{array}{rl} s = &{} \{(\langle \textit{tt}, \textit{ff}, \textit{ff}, \textit{ff}, r_0^4 \rangle , \langle \textit{ff}, \textit{tt}, \textit{ff}, r_1^3, r_1^4 \rangle , \langle \textit{ff}, \textit{ff}, \textit{ff}, \textit{ff}, r_2^4 \rangle , \langle \textit{ff}, \textit{ff}, \textit{tt}, r_3^3, r_3^4 \rangle ,\\ &{}\langle e^0, e^1, e^2, e^3, e^4 \rangle ) \mid \\ &{}r_1^3 \leftrightarrow \lnot r_3^3, e^0 \in [99,101], e^1 \in [96,98], e^2 \in [93,95], e^3 \in [90,92]\} \end{array} $$

Given a monitoring stream tuple \(s \in \mathcal {T}_{\mathbb {D}_1 \times \dots \times \mathbb {D}_n}\) we use s(t) for \(t \in \mathbb {T}\) to denote the set of all value tuples at position t. In the example above \(s(3) = \{(\textit{ff},r_1^3,\textit{ff},r_3^3,e_3) \mid r_1^3 \leftrightarrow \lnot r_3^3, e_3 \in [90,92]\}\).

In this paper we restrict ourselves to “instant-wise uncertainty”: our monitoring stream tuples only encode uncertain values which are independent from the values at other instants. That is, we can encode that the robot is in room 3 iff it is not in room 0, but not that the robot is in room 3 if it was in room 0 at the previous instant. In many cases relations among instants can still be encoded as assumptions.

To simplify the definitions, for the rest of the paper we fix a Lola specification \(\varphi = (I,S,E)\) with n input streams of types \(\mathbb {D}_i\) (\(1 \le i \le n\)) and m output streams of types \(\mathbb {D}'_i\) (\(1 \le i \le m\)). A monitoring stream tuple \(\varSigma \) for the input is then \(\varSigma \in \mathcal {T}_{\mathbb {D}_1, \dots , \mathbb {D}_n}\). We define the monitoring semantics of a Lola specification as the application of the standard Lola semantics to all stream tuples in the input monitoring stream tuple.

Definition 5

(Lola monitoring semantics). Let \(\varphi \) be a specification and \(\varSigma \) the monitoring stream tuple for the inputs. The monitoring semantics of \(\varphi \) and \(\varSigma \) is defined as:

$$ \begin{array}{l} \llbracket \varphi \rrbracket ^{\textit{mon}}: \mathcal {T}_{\mathbb {D}_1, \dots , \mathbb {D}_n} \mathrel {\rightarrow }\mathcal {T}_{\mathbb {D}_1, \dots , \mathbb {D}_n, \mathbb {D}'_1, \dots , \mathbb {D}'_m} \\ \llbracket \varphi \rrbracket ^{\textit{mon}}(\varSigma ) = \{(\sigma _1, \dots , \sigma _n) \circ \llbracket \varphi \rrbracket (\sigma _1, \dots , \sigma _n) \mid (\sigma _1,\dots ,\sigma _n) \in \varSigma \} \end{array} $$

We handle assumptions by adding the condition \( (\sigma _1,\dots ,\sigma _n)\,\models _\varLambda \,\varphi \), which restricts the input streams considered. The Lola monitoring semantics is closely related to a perfect recurrent Lola monitor: the output of a perfect recurrent Lola monitor after receiving input \(\varSigma \) at monitoring step t is \(\llbracket \varphi \rrbracket ^{\textit{mon}}(\varSigma )(t)\). Receiving tuples \(\varSigma _0,\varSigma _1,\varSigma _2,\dots \) with growing information about the input readings, a monitor could compute \(\llbracket \varphi \rrbracket ^{\textit{mon}}(\varSigma _0)\), \(\llbracket \varphi \rrbracket ^{\textit{mon}}(\varSigma _1)\), \(\llbracket \varphi \rrbracket ^{\textit{mon}}(\varSigma _2)\), \(\dots \) and generate the outputs \(\llbracket \varphi \rrbracket ^{\textit{mon}}(\varSigma _0)(0)\), \(\llbracket \varphi \rrbracket ^{\textit{mon}}(\varSigma _1)(1)\), \(\llbracket \varphi \rrbracket ^{\textit{mon}}(\varSigma _2)(2), \dots \). This monitor, however, computes a monitoring stream tuple over all inputs and outputs, so it contains information about all events of all streams, which makes this semantics costly to compute. Note that for recurrent monitoring we are actually only interested in the events at the current instant. Therefore, in the following we introduce a variation of the Lola monitoring semantics which produces sets of possible stream value combinations (called configurations) for every instant, with no information relating different instants.

We first introduce some additional notation. Given a flat specification \(\varphi = (I,S,E)\) with input stream types \(\mathbb {D}_1,\dots ,\mathbb {D}_n\) and output stream types \(\mathbb {D}'_1,\dots ,\mathbb {D}'_m\), we use \( {\textbf {D}}^{\varphi } = \mathbb {D}_1 \times \dots \times \mathbb {D}_n \times \mathbb {D}'_1 \times \dots \times \mathbb {D}'_m \) to denote the product of all stream types. Given \(d \in {\textbf {D}}^{\varphi }\) and \(s \in I \cup S\) we use d(s) to denote the entry of stream s in d. Elements from \(2^{{\textbf {D}}^{\varphi }}\), i.e. sets of stream value tuples, are called configuration sets. Given an expression \(e \in \textit{Expr}_{I \cup S}\) of type \(\mathbb {D}\), the three functions \(\llbracket e \rrbracket ^{\triangleright }_\varphi \) and \(\llbracket e \rrbracket ^{\triangleleft }_\varphi \) (with type \({\textbf {D}}^{\varphi } \times {\textbf {D}}^{\varphi } \rightarrow \mathbb {D}\)), and \(\llbracket e \rrbracket ^{\bowtie }_\varphi \) (with type \({\textbf {D}}^{\varphi } \times {\textbf {D}}^{\varphi } \times {\textbf {D}}^{\varphi } \rightarrow \mathbb {D}\)) compute the value of e at the beginning, at the end and in the middle of the trace, respectively. \(\llbracket e \rrbracket ^{\triangleright }_\varphi \) receives the configurations for the current and the subsequent instant, \(\llbracket e \rrbracket ^{\triangleleft }_\varphi \) those for the previous and the current instant, and \(\llbracket e \rrbracket ^{\bowtie }_\varphi \) those for the previous, current and subsequent instant. The \(\llbracket \cdot \rrbracket ^{\bowtie }_\varphi \) semantics is defined as:

$$ \begin{array}{rcl} \llbracket d \rrbracket ^{\bowtie }_\varphi (b,c,a) &{} = &{} d \\ \llbracket f(e_1,\dots ,e_n) \rrbracket ^{\bowtie }_\varphi (b,c,a) &{} = &{} f(\llbracket e_1 \rrbracket ^{\bowtie }_\varphi (b,c,a),\dots ,\llbracket e_n \rrbracket ^{\bowtie }_\varphi (b,c,a)) \\ \llbracket s[-1|d] \rrbracket ^{\bowtie }_\varphi (b,c,a) &{} = &{} b(s) \\ \llbracket s[\textit{now} ] \rrbracket ^{\bowtie }_\varphi (b,c,a) &{} = &{} c(s) \\ \llbracket s[+1|d] \rrbracket ^{\bowtie }_\varphi (b,c,a) &{} = &{} a(s) \end{array} $$

for constant \(d \in \mathbb {D}\), stream identifier \(s \in I \cup S\) and sub-expressions \(e_1,\dots ,e_n \in \textit{Expr}_{I \cup S}\). Here, b denotes the valuation at the previous instant, c the one at the current instant and a the one at the successor instant. The definitions of \(\llbracket e \rrbracket ^{\triangleright }_\varphi \) and \(\llbracket e \rrbracket ^{\triangleleft }_\varphi \) are analogous, but they use the default value for \(-1\) and \(+1\) references, respectively. Let \(\varphi = (I,S,E)\) and let \(S = \{s_1,\dots ,s_m\}\) be the output stream identifiers. We use

$$ \begin{array}{rclll} \llbracket \varphi \rrbracket ^{\triangleleft }(b,c) &{} = &{} (\llbracket E(s_1) \rrbracket ^{\triangleleft }_{\varphi }(b,c), &{}\dots , &{}\llbracket E(s_m) \rrbracket ^{\triangleleft }_{\varphi }(b,c)) \\ \llbracket \varphi \rrbracket ^{\triangleright }(c,a) &{} = &{} (\llbracket E(s_1) \rrbracket ^{\triangleright }_{\varphi }(c,a), &{}\dots , &{}\llbracket E(s_m) \rrbracket ^{\triangleright }_{\varphi }(c,a)) \\ \llbracket \varphi \rrbracket ^{\bowtie }(b,c,a) &{} = &{} (\llbracket E(s_1) \rrbracket ^{\bowtie }_{\varphi }(b,c,a), &{}\dots ,&{} \llbracket E(s_m) \rrbracket ^{\bowtie }_{\varphi }(b,c,a)) \end{array} $$

to denote the application of the given functions on all defining expressions of \(\varphi \).
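For instance, for the defining expression \(e = \textit{err} [\textit{now} ] \vee \textit{Ferr} [+1|\textit{ff}]\) of \(\textit{Ferr}\) from Fig. 1 (as reconstructed above), we obtain \(\llbracket e \rrbracket ^{\bowtie }_\varphi (b,c,a) = c(\textit{err}) \vee a(\textit{Ferr})\) and, at the trace end, \(\llbracket e \rrbracket ^{\triangleleft }_\varphi (b,c) = c(\textit{err}) \vee \textit{ff}\).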

We can finally define an alternative fixed point semantics which can serve as the basis for recurrent monitoring.

Definition 6

(Lola instant semantics). Let \(\varphi \) be a specification and \(\varSigma \) a monitoring stream tuple of the input streams. The instant semantics fixed point equation of \(\varphi \) and \(\varSigma \) is:

$$ \begin{array}{lcl} \llbracket \varphi \rrbracket ^{\textit{inst}}_{\varSigma } &{}:&{} (2^{{\textbf {D}}^{\varphi }})^{|\mathbb {T}|} \rightarrow (2^{{\textbf {D}}^{\varphi }})^{|\mathbb {T}|} \\ \llbracket \varphi \rrbracket ^{\textit{inst}}_{\varSigma }(V) &{}=&{} (V'_0,\dots ,V'_{t_{\textit{max}}})\\ \end{array} $$

with

$$ \begin{array}{lcl} V'_0 &{}=&{} \{c \mid c = \sigma \circ \llbracket \varphi \rrbracket ^{\triangleright }(c,a), \sigma \in \varSigma (0), a \in V(1) \} \\ V'_t &{}=&{} \{c \mid c = \sigma \circ \llbracket \varphi \rrbracket ^{\bowtie }(b,c,a), \sigma \in \varSigma (t), b \in V(t-1), a \in V(t+1)\} \\ V'_{t_{\textit{max}}} &{}=&{} \{c \mid c = \sigma \circ \llbracket \varphi \rrbracket ^{\triangleleft }(b,c), \sigma \in \varSigma (t_{\textit{max}}), b \in V(t_{\textit{max}}-1)\}. \\ \end{array} $$

The instant semantics of \(\varphi \) is given as the greatest fixed point of \(\llbracket \varphi \rrbracket ^{\textit{inst}}_{\varSigma }\) w.r.t. the point-wise \(\subseteq \) order on the \((2^{{\textbf {D}}^{\varphi }})^{|\mathbb {T}|}\) structure:

$$ \begin{array}{lcl} \llbracket \varphi \rrbracket ^{\textit{inst}} &{}:&{} \mathcal {T}_{\mathbb {D}_1, \dots , \mathbb {D}_n} \rightarrow (2^{{\textbf {D}}^{\varphi }})^{|\mathbb {T}|} \\ \llbracket \varphi \rrbracket ^{\textit{inst}}(\varSigma ) &{}=&{} \nu (\llbracket \varphi \rrbracket ^{\textit{inst}}_{\varSigma })\\ \end{array} $$

The instant semantics fixed point equation takes a structure of configuration sets, one for every trace position, and returns a homogeneous structure consisting of the possible inputs and the semantics of the output stream expressions for the corresponding positions (based on the argument structure). Consequently, a fixed point of this equation is a solution of the Lola specification. We define the instant semantics as the greatest fixed point of the instant semantics fixed point equation. One structure is greater than or equal to another if at every instant it contains at least the same configurations, i.e. the order is the point-wise application of \(\subseteq \). Note that the instant semantics of \(\varphi \) is equivalent to the monitoring semantics with respect to the stream events at every instant, that is

$$ \forall t \in \mathbb {T}: \llbracket \varphi \rrbracket ^{\textit{inst}}(\varSigma )(t) = \llbracket \varphi \rrbracket ^{\textit{mon}}(\varSigma )(t). $$

Hence, this semantics can also be used as a basis for recurrent monitoring. Computing this semantics, however, is rather complex (it requires a fixed point iteration), and it must be recomputed every time new inputs are received (since \(\varSigma \) changes). Therefore, we adjust this semantics once more. Instead of computing the possible value combinations (configuration sets) directly, we now compute them parametric in the values of the previous instant, using the structure \(({\textbf {D}}^{\varphi } \rightarrow 2^{{\textbf {D}}^{\varphi }})^{|\mathbb {T}|}\) instead of \((2^{{\textbf {D}}^{\varphi }})^{|\mathbb {T}|}\). We call the elements of this structure transformers, as they transform the configurations of the previous instant into those of the current instant. A transformer receives a configuration \(b \in {\textbf {D}}^{\varphi }\) at \(t \in \mathbb {T}\) and returns the set of all possible configurations at \(t+1 \in \mathbb {T}\), provided b.

Definition 7

(Lola transformer semantics). Let \(\varphi \) be a specification and \(\varSigma \) a monitoring stream tuple of the input streams. The transformer semantics fixed point equation of \(\varphi \) and \(\varSigma \) is given as:

$$ \begin{array}{lcl} \llbracket \varphi \rrbracket ^{\textit{tra}}_{\varSigma } &{}:&{} ({\textbf {D}}^{\varphi } \rightarrow 2^{{\textbf {D}}^{\varphi }})^{|\mathbb {T}|} \rightarrow ({\textbf {D}}^{\varphi } \rightarrow 2^{{\textbf {D}}^{\varphi }})^{|\mathbb {T}|} \\ \llbracket \varphi \rrbracket ^{\textit{tra}}_{\varSigma }(V) &{}=&{} (V'_0,\dots ,V'_{t_{\textit{max}}}) \end{array} $$

with

$$ \begin{array}{lcl} V'_0(b) &{}=&{} \{c \mid c = \sigma \circ \llbracket \varphi \rrbracket ^{\triangleright }(c,a), \sigma \in \varSigma (0), a \in V(1)(c) \} \\ V'_t(b) &{}=&{} \{c \mid c = \sigma \circ \llbracket \varphi \rrbracket ^{\bowtie }(b,c,a), \sigma \in \varSigma (t), a \in V(t+1)(c)\} \\ V'_{t_{\textit{max}}}(b) &{}=&{} \{c \mid c = \sigma \circ \llbracket \varphi \rrbracket ^{\triangleleft }(b,c), \sigma \in \varSigma (t_{\textit{max}})\}.\\ \end{array} $$

The transformer semantics of \(\varphi \) is the unique fixed point of \(\llbracket \varphi \rrbracket ^{\textit{tra}}_{\varSigma }\):

$$ \begin{array}{lcl} \llbracket \varphi \rrbracket ^{\textit{tra}} &{}:&{} \mathcal {T}_{\mathbb {D}_1, \dots , \mathbb {D}_n} \rightarrow ({\textbf {D}}^{\varphi } \rightarrow 2^{{\textbf {D}}^{\varphi }})^{|\mathbb {T}|} \\ \llbracket \varphi \rrbracket ^{\textit{tra}}(\varSigma ) &{}=&{} \mu (\llbracket \varphi \rrbracket ^{\textit{tra}}_{\varSigma })\\ \end{array} $$

This semantics is essentially equivalent to the instant semantics, except that \(V'_t\) no longer depends on \(V(t-1)\), as the generated transformers are parameterized by the configuration of the previous instant. Hence, b is now a parameter of the individual structure entries, while a is still obtained from the argument structure of the fixed point equation by applying the current configuration to the subsequent transformer (\(V(t+1)(c)\)).

This new semantics has several advantages for online monitoring. First, the fixed point of this semantics is unique and can (as opposed to the monitoring and instant semantics) be computed deterministically from the back, as the individual transformer entries only depend on the subsequent transformer. Second, this semantics can still conveniently be used for recurrent monitoring: one can alternately compute the current monitor state (i.e. the currently possible stream configurations) and the transformer to the subsequent instant, and apply the current state to that transformer (see Sect. 5). However, one caveat is that computing with \(({\textbf {D}}^{\varphi } \rightarrow 2^{{\textbf {D}}^{\varphi }})^{|\mathbb {T}|}\) is complex, as it is unclear how to represent the elements of \({\textbf {D}}^{\varphi } \rightarrow 2^{{\textbf {D}}^{\varphi }}\) and of \(2^{{\textbf {D}}^{\varphi }}\). Furthermore, the recursively defined sets \(V'_i\) are hard to determine. Therefore, we introduce a framework for the abstract computation of this semantics.

4 An Abstraction Framework for Lola Monitoring

We borrow concepts from abstract interpretation to efficiently implement the transformer semantics. The main element is an abstract domain which is a perfect representation (or a sound over-approximation) of the transformer or configuration set domain. An appropriate abstract domain must be easy to represent in memory and enable efficient computations.

We introduce two domains: A, whose elements abstract concrete configuration sets from Sect. 3, and \(\tilde{A}\), whose elements abstract transformers. We require that \((A,\sqsubseteq ^A)\) and \((\tilde{A},\sqsubseteq ^{\tilde{A}})\) are complete lattices, that is, partial orders where every subset has a least upper bound and a greatest lower bound. The relation \(a \sqsubseteq ^A b\) indicates that b over-approximates a, i.e. that every configuration represented by a is also represented by b, and analogously for \(\sqsubseteq ^{\tilde{A}}\). We demand the existence of functions:

$$ \begin{array}{lllll} \gamma ^A: &{} A \rightarrow 2^{{\textbf {D}}^{\varphi }} &{}\;&{} \alpha ^A: &{} 2^{{\textbf {D}}^{\varphi }} \rightarrow A \\ \gamma ^{\tilde{A}}: &{} \tilde{A} \rightarrow ({\textbf {D}}^{\varphi } \rightarrow 2^{{\textbf {D}}^{\varphi }}) &{}\;&{} \alpha ^{\tilde{A}}: &{} ({\textbf {D}}^{\varphi } \rightarrow 2^{{\textbf {D}}^{\varphi }}) \rightarrow \tilde{A} \end{array} $$

which are able to translate from the concrete configuration set or transformer domain to the abstract counterpart and back. We require that these function pairs are Galois connections:

$$ \begin{array}{lcl} \forall a \in A, c \in 2^{{\textbf {D}}^{\varphi }} &{}:&{} \alpha ^A(c) \sqsubseteq ^A a \leftrightarrow c \subseteq \gamma ^A(a) \\ \forall a \in A, c \in ({\textbf {D}}^{\varphi } \rightarrow 2^{{\textbf {D}}^{\varphi }})&{}:&{} \alpha ^{\tilde{A}}(c) \sqsubseteq ^{\tilde{A}} a \leftrightarrow c \trianglelefteq \gamma ^{\tilde{A}}(a) \end{array} $$

Here, \(\trianglelefteq \) denotes the point-wise application of \(\subseteq \) to the configuration sets that the functions in \(({\textbf {D}}^{\varphi } \rightarrow 2^{{\textbf {D}}^{\varphi }})\) map to. The Galois connections ensure that a translation from the concrete to the abstract domain and back leads to an over-approximation, so computations in the abstract domain produce sound monitor outputs.

We say that A is a perfect configuration set abstraction if for all \(c \in 2^{{\textbf {D}}^{\varphi }}\), \(\gamma ^A(\alpha ^A(c)) = c\). Analogously \(\tilde{A}\) is a perfect transformer abstraction if for all \(c \in ({\textbf {D}}^{\varphi } \rightarrow 2^{{\textbf {D}}^{\varphi }})\), \(\gamma ^{\tilde{A}}(\alpha ^{\tilde{A}}(c)) = c\).
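For intuition, the following Python sketch shows a simple abstraction of finite sets of integer configurations by intervals, together with \(\alpha \) and a membership-based \(\gamma \); this particular domain is our own choice for illustration, not one prescribed by the paper:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Interval:
    lo: float   # -inf allowed
    hi: float   # +inf allowed

    def contains(self, x):
        # Implicit gamma: the set of concrete values represented by the interval.
        return self.lo <= x <= self.hi

def alpha(config_set):
    """Best interval abstraction of a finite set of integers."""
    if not config_set:
        return Interval(float("inf"), float("-inf"))   # empty interval (bottom)
    return Interval(min(config_set), max(config_set))

# Galois property on an example: alpha(C) is below the abstract value a
# exactly when C is contained in gamma(a).
C, a = {2, 5, 9}, Interval(0, 10)
print(all(a.contains(x) for x in C))                   # True: C is a subset of gamma(a)
print(a.lo <= alpha(C).lo and alpha(C).hi <= a.hi)     # True: alpha(C) is below a
# The abstraction is sound but not perfect: gamma(alpha({2, 9})) also contains 5.
```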

Symbolic Abstraction. We now introduce perfect abstract domains for transformers and configuration sets based on symbolic constraints, which will later be used for an anticipatory Lola monitoring algorithm in Sect. 6. For the symbolic abstraction we use symbolic constraints (i.e. quantifier-free first order logic formulas) that perfectly describe the relation among all possible values of a configuration or transformer.

We start with the symbolic representation of configuration sets. We use a symbolic constraint where every stream value is represented by its own variable. For example, for two streams b (of type bool) and r (of type real), the configuration set \(C = \{(\textit{tt}, 3),(\textit{ff}, 5)\}\) captures values that are either \(\textit{tt}\) and 3 or \(\textit{ff}\) and 5. This configuration set can be expressed as \((b \rightarrow (r=3)) \wedge (\lnot b \rightarrow (r=5))\). Our symbolic computation is restricted to those configuration sets which are symbolically representable, thus the theory of choice (e.g. Boolean algebra or linear real arithmetic) determines the capabilities of the monitor. We assume that the chosen theory can encode all monitor inputs and all operations in the specification.

The concretization function of a symbolic constraint \(\psi \) is:

$$ \begin{array}{l} \gamma (\psi ) = \{v \in {\textbf {D}}^{\varphi } \mid \big ( \bigwedge \limits _{s \in I \cup S} s = v(s) \big )\,\models \,\psi \} \end{array} $$

Recall that v(s) denotes the value of stream s in a configuration \(v \in {\textbf {D}}^{\varphi }\). We implicitly define \(\alpha \) s.t. for any configuration set \(C \in 2^{{\textbf {D}}^{\varphi }}\), \(\gamma (\alpha (C)) = C\). That is, every configuration set C has a canonical symbolic encoding. In the algorithm we only require \(\alpha \) for translating uncertain input readings to symbolic representations. Note that by the given definition of \(\alpha \) the symbolic domain is a perfect configuration set abstraction. Also note that while our symbolic domain is defined as abstraction of configuration sets over all streams, it is also possible to encode only sets of sub-configurations, e.g. only input stream values.

Consider for example Fig. 1 and the following configuration set \(v = \{(\textit{ff},\) \(r_1^3,\) \(\textit{ff},\) \(r_3^3,\) \(e_3) \mid \lnot (r_1^3 \leftrightarrow r_3^3), e_3 \in [90,92]\},\) which represents the uncertain input for instant 3 from the example above. A symbolic representation of this configuration set is \( \alpha (v) = \lnot r_0 \wedge \lnot r_2 \wedge \lnot (r_1 \leftrightarrow r_3) \wedge (90 \le e \le 92). \)

We also encode transformers symbolically, extending the variables of our constraints to \(I \cup S \cup \{s^{-1} \mid s \in I \cup S\}\), where \(s^{-1}\) represents the value of stream s at the previous instant, in which the transformer is parametric. The corresponding concretization function for transformers is given as \(\gamma (\psi ) = \tau \) s.t.

$$ \begin{array}{l} \forall v \in {\textbf {D}}^{\varphi }: \tau (v) = \{u \in {\textbf {D}}^{\varphi } \mid \big ( \bigwedge \limits _{s \in I \cup S} ((s^{-1} = v(s)) \wedge (s = u(s))) \big )\,\models \,\psi \}. \end{array} $$

Abstract Transformer Semantics Computation. We now present the computation of an alternative, abstract transformer semantics, related to the concrete semantics given in Definition 7. This semantics is computed in an \(\tilde{A}^{|\mathbb {T}|}\) structure where each entry contains the abstract transformer for the corresponding trace position.

We fix an abstract transformer domain \(\tilde{A}\) with translation functions \(\gamma ^{\tilde{A}}: \tilde{A} \rightarrow ({\textbf {D}}^{\varphi } \rightarrow 2^{{\textbf {D}}^{\varphi }})\) and \(\alpha ^{\tilde{A}}: ({\textbf {D}}^{\varphi } \rightarrow 2^{{\textbf {D}}^{\varphi }}) \rightarrow \tilde{A}\).

Definition 8

(Abstract Lola transformer semantics). A fixed point equation for \(\varphi \) and \(\varSigma \) is called an abstract Lola transformer fixed point equation if

$$ \begin{array}{lcl} \llbracket \varphi \rrbracket ^{\sharp }_{\varSigma } &{}:&{} \tilde{A}^{|\mathbb {T}|} \rightarrow \tilde{A}^{|\mathbb {T}|} \\ \llbracket \varphi \rrbracket ^{\sharp }_{\varSigma }(V) &{}=&{} (\tau _{\varphi ,\varSigma }^0(V(1)),\tau _{\varphi ,\varSigma }^1(V(2)),\dots ,\tau _{\varphi ,\varSigma }^{t_{\textit{max}}})\\ \end{array} $$

with \(\tau _{\varphi ,\varSigma }^{t_{\textit{max}}} : \tilde{A}\) and \(\tau _{\varphi ,\varSigma }^t : \tilde{A} \rightarrow \tilde{A}\) for \(t \in \{0,\dots ,t_{\textit{max}}-1\}\) s.t.

$$ \begin{array}{lcl} \tau _{\varphi ,\varSigma }^0(V_1) &{} \sqsupseteq ^{\tilde{A}} &{} \alpha ^{\tilde{A}}(b \mapsto \{c \mid c = \sigma \circ \llbracket \varphi \rrbracket ^{\triangleright }(c,a), \sigma \in \varSigma (0), a \in \gamma ^{\tilde{A}}(V_{1})(c) \}) \\ \tau _{\varphi ,\varSigma }^t(V_{t+1}) &{} \sqsupseteq ^{\tilde{A}} &{} \alpha ^{\tilde{A}}(b \mapsto \{c \mid c = \sigma \circ \llbracket \varphi \rrbracket ^{\bowtie }(b,c,a), \sigma \in \varSigma (t), a \in \gamma ^{\tilde{A}}(V_{t+1})(c)\}) \\ \tau _{\varphi ,\varSigma }^{t_{\textit{max}}} &{} \sqsupseteq ^{\tilde{A}} &{} \alpha ^{\tilde{A}}(b \mapsto \{c \mid c = \sigma \circ \llbracket \varphi \rrbracket ^{\triangleleft }(b,c), \sigma \in \varSigma (t_{\textit{max}})\}).\\ \end{array} $$

This corresponds to a computation in the abstract structure \(\tilde{A}^{|\mathbb {T}|}\) where all entries are over-approximations of the transformers of the concrete Lola transformer semantics. If the \(\sqsupseteq ^{\tilde{A}}\) relation in the above definition is an equality, then \(\llbracket \varphi \rrbracket ^{\sharp }_{\varSigma }\) is called a perfect abstract Lola transformer fixed point equation. In Sect. 6 we will provide the abstract transformer constructors \(\tau _{\varphi ,\varSigma }^t\) for the symbolic abstract domain introduced above.

As in the concrete case, the abstract transformer fixed point equation above has a unique fixed point \(\mu (\llbracket \varphi \rrbracket ^{\sharp }_{\varSigma })\), as it can be computed deterministically from back to front given a particular input \(\varSigma \). We say that our abstract transformer semantics is sound with respect to the concrete semantics if for all \(t \in \mathbb {T}\), \(\mu (\llbracket \varphi \rrbracket ^{\textit{tra}}_{\varSigma })(t) \trianglelefteq \gamma ^{\tilde{A}}(\mu (\llbracket \varphi \rrbracket ^{\sharp }_{\varSigma })(t))\), and perfect if \(\mu (\llbracket \varphi \rrbracket ^{\textit{tra}}_{\varSigma })(t) = \gamma ^{\tilde{A}}(\mu (\llbracket \varphi \rrbracket ^{\sharp }_{\varSigma })(t))\). By the properties of abstract interpretation, the following holds:

Theorem 1

Let \(\mu (\llbracket \varphi \rrbracket ^{\sharp }_{\varSigma })\) be an abstract transformer semantics for \(\varphi \). Then:

  • \(\mu (\llbracket \varphi \rrbracket ^{\sharp }_{\varSigma })\) is sound.

  • \(\mu (\llbracket \varphi \rrbracket ^{\sharp }_{\varSigma })\) is perfect if \(\llbracket \varphi \rrbracket ^{\sharp }_{\varSigma }\) is a perfect abstract Lola transformer fixed point equation and \(\tilde{A}\) is a perfect transformer abstraction.

This justifies that we can build a sound or perfect recurrent Lola monitor based on this abstract semantics. Consider the computation of the fixed point \(\mu (\llbracket \varphi \rrbracket ^{\sharp }_\top )\), where \(\top \) is the maximal element in \(\mathcal {T}_{\mathbb {D}_1,\dots ,\mathbb {D}_n}\) (i.e. the input monitoring stream tuple where no information about any input stream is available). The abstract transformer structure chosen for the abstract semantics has one significant advantage for the computation of this fixed point: as soon as a single element in \(S = \mu (\llbracket \varphi \rrbracket ^{\sharp }_\top )\) repeats, all entries of the structure (except the one for instant 0) are known. This is because if \(S(t) = S(t+k)\) for \(k > 0\), \(t \in \mathbb {T}\), then also \(S(t-1)=S(t+k-1)\) (as no input information is available with \(\varSigma = \top \)). Therefore, all entries in S can be filled down to instant 1 without further computation. Hence, \(\mu (\llbracket \varphi \rrbracket ^{\sharp }_\top )\) can be computed back to front until the first instant at which \(\mu (\llbracket \varphi \rrbracket ^{\sharp }_\top )(t) = \mu (\llbracket \varphi \rrbracket ^{\sharp }_\top )(t+k)\) occurs, and then the values at all instants are determined (except for the first entry). If the number of elements in the abstract domain \(\tilde{A}\) is bounded by c (e.g. for Boolean specifications), then after at most c iterations a loop in \(\mu (\llbracket \varphi \rrbracket ^{\sharp }_\top )\) is found. There are domains beyond Booleans for which finite perfect representations exist [13].
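The following Python sketch illustrates this back-to-front computation with repetition detection. The constructors tau_last, tau_step and tau_first stand for \(\tau ^{t_{\textit{max}}}_{\varphi ,\top }\), \(\tau ^{t}_{\varphi ,\top }\) (which is the same for all \(0< t < t_{\textit{max}} \), since \(\varSigma = \top \) carries no input information) and \(\tau ^{0}_{\varphi ,\top }\); they are assumptions of this sketch and depend on the chosen abstract domain:

```python
def initial_fixed_point(tau_last, tau_step, tau_first, t_max):
    """Compute mu([[phi]]^sharp_top) from the back, stopping at a repetition.

    Abstract transformers must be hashable and comparable for equality.
    Assumes t_max >= 1.  Returns a function mapping t in {0, ..., t_max}
    to its entry of the structure.
    """
    suffix = [tau_last]            # suffix[k] is the entry for instant t_max - k
    seen = {tau_last: 0}
    loop_start = period = None
    while len(suffix) < t_max:     # entries for instants t_max down to 1 suffice
        nxt = tau_step(suffix[-1])
        if nxt in seen:            # repetition: the sequence is periodic from here on
            loop_start = seen[nxt]
            period = len(suffix) - loop_start
            break
        seen[nxt] = len(suffix)
        suffix.append(nxt)

    def entry(t):
        if t == 0:
            return tau_first(entry(1))        # instant 0 uses its own constructor
        k = t_max - t                         # distance from the trace end
        if k < len(suffix):
            return suffix[k]
        return suffix[loop_start + (k - loop_start) % period]   # wrap inside the loop

    return entry
```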

For abstract domains where \(|\tilde{A}|\) is unbounded, one can use a widening operator [7, 8]: for example, use \(\mu (\llbracket \varphi \rrbracket ^{\sharp }_\top )(t) \bigtriangledown \mu (\llbracket \varphi \rrbracket ^{\sharp }_\top )(t-1)\) instead of \(\mu (\llbracket \varphi \rrbracket ^{\sharp }_\top )(t-1)\) in the fixed point computation, where the operator \(\bigtriangledown : \tilde{A} \times \tilde{A} \rightarrow \tilde{A}\) yields an over-approximation of both arguments by taking all unstable components of the abstractions directly to their extreme limits, thus enforcing a loop in \(\mu (\llbracket \varphi \rrbracket ^{\sharp }_\top )\).

Based on these observations we build in the next section an efficient sound (or perfect) recurrent Lola monitoring algorithm.

5 Abstraction-Based Recurrent Lola Monitoring

We introduce our monitor construction based on the abstract structure from the previous section. At runtime the monitor receives information incrementally, so there is an extending sequence of input monitoring stream tuples \(\varSigma _0,\varSigma _1,\dots ,\varSigma _{t_{\textit{max}}}\), where in \(\varSigma _t\) all streams are fully unknown for instants larger than t and equal to \(\varSigma _{t-1}\) for instants smaller than t. Based on this observation we introduce the online monitoring procedure shown as Algorithm 1.

Algorithm 1. Abstraction-based recurrent Lola monitoring.

The algorithm first determines \(\mu (\llbracket \varphi \rrbracket ^{\sharp }_{\top })\), which does not depend on the inputs and can thus be computed statically as part of the monitor synthesis (as described at the end of the previous section). Then, at runtime, the monitor iteratively receives the (possibly uncertain) inputs for the current instant t and computes \(\mu (\llbracket \varphi \rrbracket ^{\sharp }_{\varSigma _t})(t)\). By definition

$$ \mu (\llbracket \varphi \rrbracket _{\varSigma _t}^\sharp )(t) = \tau ^t_{\varphi ,\varSigma _t}(\mu (\llbracket \varphi \rrbracket _{\varSigma _t}^\sharp )(t+1)). $$

However, \(\mu (\llbracket \varphi \rrbracket ^{\sharp }_{\varSigma _t})(t+1) = \mu (\llbracket \varphi \rrbracket ^{\sharp }_{\top })(t+1)\) because for all \(t' > t\) no inputs are available yet. This can be taken from the pre-computed \(\mu (\llbracket \varphi \rrbracket ^{\sharp }_{\top })\), and hence \(\mu (\llbracket \varphi \rrbracket ^{\sharp }_{\varSigma _t})(t)\) can be efficiently determined by applying \(\tau ^t_{\varphi ,\varSigma _t}\) once without requiring a full computation of the fixed point \(\mu (\llbracket \varphi \rrbracket ^{\sharp }_{\varSigma _t})\) from the end.

Then, the algorithm applies the computed transformer \(\mu (\llbracket \varphi \rrbracket ^{\sharp }_{\varSigma _t})(t)\) to the abstract configuration set from the previous step, stored in \(s^\sharp \) (of type A), and assigns the result to \(s^\sharp \) again. In this manner \(s^\sharp \) represents the monitor state: the set of possible stream configurations at the current instant t. Note that \(s^\sharp \) is not available for \(t=0\), as there is no previous instant and thus no previous monitor state; yet \(\mu (\llbracket \varphi \rrbracket ^{\sharp }_{\varSigma _0})(0)\) yields (by definition) a transformer which is independent of the predecessor argument. The concrete representation of \(s^\sharp \) is \(\gamma ^A(s^\sharp )\), which consists of a set of possible value tuples for all streams and serves as the monitor output. This output is perfect if and only if the chosen abstract domains A and \(\tilde{A}\) are perfect configuration set and transformer abstractions and the abstract transformer semantics is also perfect (see Theorem 1).
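The following Python sketch captures the online loop of Algorithm 1 as described in the text; precomputed, build_transformer, apply_transformer and gamma are placeholders for the abstraction-specific operations and are assumptions of this sketch, not the authors' implementation:

```python
def recurrent_monitor(precomputed, build_transformer, apply_transformer, gamma, inputs):
    """Online recurrent Lola monitoring loop (sketch of Algorithm 1).

    precomputed(t) returns mu([[phi]]^sharp_top)(t), computed before monitoring;
    build_transformer(t, reading, next_entry) constructs tau^t for the (possibly
    uncertain) reading of instant t; apply_transformer(trans, state) applies an
    abstract transformer to the abstract monitor state (None at t = 0);
    gamma concretizes the state.  inputs is a sequence of per-instant readings.
    """
    state = None
    t_max = len(inputs) - 1
    for t, reading in enumerate(inputs):
        # Entries strictly after t are unaffected by the inputs seen so far,
        # so the statically precomputed structure can be reused for instant t+1.
        next_entry = precomputed(t + 1) if t < t_max else None
        trans = build_transformer(t, reading, next_entry)
        state = apply_transformer(trans, state)   # new monitor state s#
        yield gamma(state)                        # output: possible configurations at t
```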

The application of an abstract transformer \(T \in \tilde{A}\) to a configuration set abstraction \(s \in A\) is technically defined as \( T(s) = \alpha ^A\big (\bigcup _{c \in \gamma ^A(s)} \gamma ^{\tilde{A}}(T)(c)\big ). \) Depending on the concrete abstractions there may be simpler ways to compute this application, for example using symbolic constraints, as we will see in the next section.

The size of \(s^\sharp \) may grow over time, so for a constant-size monitor it may be necessary to find an over-approximation. In conclusion the following holds:

Theorem 2

Let \(\varphi \) be a Lola specification and let \(\varSigma _0,\) \(\varSigma _1,\) \(\dots ,\) \(\varSigma _{t_{\textit{max}}}\) be an extending sequence of input monitoring stream tuples where \(\varSigma _t\) contains the input readings for instant t. Algorithm 1 yields a sound recurrent Lola monitor, and a perfect recurrent Lola monitor if \(\llbracket \varphi \rrbracket ^{\sharp }_{\varSigma }\) is a perfect abstract Lola transformer fixed point equation and \(\tilde{A}\) is a perfect transformer abstraction.

6 Symbolic Recurrent Lola Monitoring

We now present a symbolic monitoring strategy, under assumptions and tolerating uncertainty, for linear real arithmetic Lola specifications, based on the general framework from the previous section. This theory supports real and Boolean streams, the common Boolean operations, addition, multiplication by constants, and comparisons among real streams.

We will use the symbolic abstract domain to symbolically represent configurations and transformers. For convenience we use instant variables, formed by stream names with the corresponding instant as a superscript. For example, \(s^3\) denotes the value of the event of stream s at instant 3. Abstractions of configuration sets only contain variables of a single instant; transformer abstractions contain those of the current and the previous instant.

Example 2

Consider a specification with a single stream e of type real. The configuration set stating that the value of e at instant 3 is between 90 and 92 (both inclusive) is represented by the constraint \( 90 \le e^3 \le 92. \) To express that the value of e at instant 4 is at least 3 less than the value one instant before, that is, the transformer \(T(e) = \{e' \mid e' \le e - 3\}\), we can use \( e^4 \le e^3 - 3. \)

A perfect monitoring procedure requires, besides perfect abstract domains \(\tilde{A}\) and A, perfect symbolic constructions for the transformers \(\tau ^t_{\varphi ,\varSigma }\). This can be achieved in a straightforward manner as follows. To compute the transformer at instant t, we take the symbolic representation of the subsequent transformer in the structure and conjoin it with the symbolic instantiation of the specification at the current instant and with the input readings for the current instant. This works because we required the input values of different instants to be independent of each other. For \(t=0\) or \(t=t_{\textit{max}} \) we use the default values.

Example 3

Consider again the specification from Fig. 1 (for this example without the parts added later, such as assumptions), \(\mathbb {T}=\{0,\dots ,10\}\), and the situation where no inputs are known yet, which is encoded by the symbolic constraint \(\textit{tt}\). The symbolic transformer \(\tau ^{t_{\textit{max}}}_{\varphi ,\top }\) for \(t_{\textit{max}} = 10\) is:

$$ \begin{array}{l} \mu (\llbracket \varphi \rrbracket ^{\sharp }_{\top })(10) = \tau ^{10}_{\varphi ,\top } = (\textit{err} ^{10} = \lnot r_0^{10} \wedge (e^{10} < 5)) \wedge (\textit{Ferr} ^{10} = \textit{err} ^{10}) \end{array} $$

and \(\tau ^{9}_{\varphi ,\top }\) applied to \(\mu (\llbracket \varphi \rrbracket ^{\sharp }_{\top })(10)\) is

$$ {\begin{array}{l} \mu (\llbracket \varphi \rrbracket ^{\sharp }_{\top })(9) = \tau ^{9}_{\varphi ,\top }(\mu (\llbracket \varphi \rrbracket ^{\sharp }_{\top })(10)) = (\textit{err} ^{10} = \lnot r_0^{10} \wedge (e^{10} < 5)) \wedge (\textit{Ferr} ^{10} = \textit{err} ^{10})\\ \qquad \qquad \qquad \qquad \qquad \qquad \quad \; \wedge \;(\textit{err} ^{9} = \lnot r_0^{9} \wedge (e^{9} < 5)) \wedge (\textit{Ferr} ^{9} = \textit{err} ^{9} \vee \textit{Ferr} ^{10}). \end{array}} $$

Applying this strategy, the resulting formulas grow and ultimately involve all instant variables from the current instant up to the trace end. In particular, they include instant variables of later instants, which must not appear in the transformers: their presence could prevent finding a repeated element in \(\tilde{A}\) and result in a full unrolling of the specification. The fully computed transformers would express relations among all instant variables up to the stream end. In contrast, our online monitoring only preserves the relation among the variables of the current and the previous instant, so we search for an alternative representation of the formula above which is equivalent w.r.t. the instant variables at the current and previous time points. This amounts to existentially quantifying over the variables to be removed and applying quantifier elimination where it is available.

Example 4

Revisiting the previous example, linear real arithmetic quantifier elimination determines that \(\mu (\llbracket \varphi \rrbracket ^{\sharp }_{\top })(9)\) is

$$ \begin{array}{l} \mu (\!\llbracket \varphi \rrbracket ^{\sharp }_{\top })(9) = \exists r_0^{10},\textit{err} ^{10},\textit{Ferr} ^{10},e^{10}. (\textit{err} ^{10} = \lnot r_0^{10} \wedge (e^{10} < 5)) \; \wedge \\ \qquad \quad (\textit{Ferr} ^{10} = \textit{err} ^{10}) \wedge (\textit{err} ^{9} = \lnot r_0^{9} \wedge (e^{9} < 5)) \; \wedge (\textit{Ferr} ^{9} = \textit{err} ^{9} \vee \textit{Ferr} ^{10}) \\ \,\,\,\,\,\,\, = (\textit{err} ^{9} = \lnot r_0^{9} \wedge (e^{9} < 5)) \wedge (\textit{err} ^{9} \rightarrow \textit{Ferr} ^{9}) \end{array} $$

Following this strategy for \(\mu (\llbracket \varphi \rrbracket ^{\sharp }_{\top })(8)\):

$$ \begin{array}{l} \mu (\!\llbracket \varphi \rrbracket ^{\sharp }_{\top })(8) = \exists r_0^{9}, \textit{err} ^{9},\textit{Ferr} ^{9},e^9. (\textit{err} ^{9} = \lnot r_0^{9} \wedge (e^{9} < 5)) \; \wedge \\ \qquad \quad (\textit{err} ^{9} \rightarrow \textit{Ferr} ^{9}) \wedge (\textit{err} ^{8} = \lnot r_0^{8} \wedge (e^{8} < 5)) \; \wedge (\textit{Ferr} ^{8} = \textit{err} ^{8} \vee \textit{Ferr} ^{9}) \\ \,\,\,\,\,\,\, = (\textit{err} ^{8} = \lnot r_0^{8} \wedge (e^{8} < 5)) \wedge (\textit{err} ^{8} \rightarrow \textit{Ferr} ^{8}) \end{array} $$

Thus \(\mu (\llbracket \varphi \rrbracket ^{\sharp }_{\top })(9)\) and \(\mu (\llbracket \varphi \rrbracket ^{\sharp }_{\top })(8)\) are equal to each other (modulo the timestamps of the instant variables) and consequently also to \(\mu (\llbracket \varphi \rrbracket ^{\sharp }_{\top })(7), \dots , \mu (\llbracket \varphi \rrbracket ^{\sharp }_{\top })(1)\). Hence, after three computation steps \(\mu (\llbracket \varphi \rrbracket ^{\sharp }_{\top })\) is fully computed, independent of the concrete \(t_{\textit{max}} \) (except for the entry at instant 0).
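The quantifier elimination step of this example can be reproduced symbolically with Z3. The following sketch (again on top of the Z3 Python bindings, with our own variable names for the instant variables) eliminates the instant-10 variables and checks that the result coincides with the stabilized formula above:

```python
from z3 import Bools, Real, And, Or, Not, Implies, Exists, Tactic, prove

# Instant variables for instants 9 and 10
r0_9, err_9, Ferr_9 = Bools('r0_9 err_9 Ferr_9')
r0_10, err_10, Ferr_10 = Bools('r0_10 err_10 Ferr_10')
e9, e10 = Real('e9'), Real('e10')

# tau^10 conjoined with the instantiation of the specification at instant 9
phi = And(err_10 == And(Not(r0_10), e10 < 5),
          Ferr_10 == err_10,
          err_9 == And(Not(r0_9), e9 < 5),
          Ferr_9 == Or(err_9, Ferr_10))

# Existentially quantify the instant-10 variables and eliminate the quantifier
reduced = Tactic('qe')(Exists([r0_10, err_10, Ferr_10, e10], phi)).as_expr()

# Compare with the stabilized entry computed in Example 4
stabilized = And(err_9 == And(Not(r0_9), e9 < 5), Implies(err_9, Ferr_9))
prove(reduced == stabilized)  # expected to print "proved"
```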

If the specification contains assumptions, we additionally conjoin the assumption constraint \(\varLambda ^t\) to each symbolic transformer. Unfortunately, quantifier elimination is not guaranteed to reach a stabilized formula as above. We therefore propose the following three-stage strategy for the computation of the initial fixed point, which may ultimately lead to an over-approximation of \(\mu (\llbracket \varphi \rrbracket ^{\sharp }_{\top })\):

1. Compute the elements of \(\mu (\llbracket \varphi \rrbracket ^{\sharp }_{\top })\) from back to front, applying quantifier elimination, for k steps.

2. If no repeating entry is found within l steps, the elements of \(\mu (\llbracket \varphi \rrbracket ^{\sharp }_{\top })\) are still computed, but in addition to the variables of future instants, all real variables are eliminated. For the real variables of the current instant, maximal and minimal bounds are derived from the computed symbolic representation and added to the final symbolic representation (see [19]).

3. If still no repeating element is found, the strategy is applied again, but with widening [7] on the bounds of the intervals of two subsequent instants (see the sketch after this list). For example, let \([a, b]\) be the previously computed interval and \([a',b']\) the new one. The lower bound of the widened interval is \(-\infty \) if \(a' < a\) and a otherwise; dually, the upper bound is \(\infty \) if \(b' > b\) and b otherwise.
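
The widening of the third stage can be sketched as a small Python function on interval bounds (the interval encoding below, with \(\pm \infty \) as floating point infinities, is only illustrative and not taken from the prototype):

```python
from typing import Tuple

Interval = Tuple[float, float]  # bounds of a real instant variable, +/- infinity as float('inf')

def widen(old: Interval, new: Interval) -> Interval:
    """Interval widening: a bound that is still unstable jumps to +/- infinity."""
    (a, b), (a2, b2) = old, new
    lo = float('-inf') if a2 < a else a   # lower bound decreased -> widen to -infinity
    hi = float('inf') if b2 > b else b    # upper bound increased -> widen to +infinity
    return (lo, hi)

# The lower bound keeps decreasing between two subsequent instants and is widened away
print(widen((90.0, 92.0), (87.0, 91.0)))  # (-inf, 92.0)
```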

As all constraints over a fixed number of Boolean variables can be represented in a formula of constant length and the bounds of all real variables either stabilize or are brought to \(\pm \infty \) by widening, it is guaranteed that a repeating element will be found in the third stage. Note that eliminating real variables and replacing their constraints with bounds leads to an over-approximation. The resulting transformer and monitor are still sound but not necessarily perfect.

During monitoring we finally recompute \(\mu (\llbracket \varphi \rrbracket ^{\sharp }_{\varSigma ^t})(t)\) for each timestamp t. We do this analogously to the initial fixed point computation, but additionally add the new input constraints \(\alpha (\varSigma ^t) \backslash \alpha (\varSigma ^{t-1})\) (i.e. the input readings of the current instant).
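As a minimal illustration of adding the input readings (the concrete reading below is hypothetical), conjoining the constraints of the current instant to the stabilized entry from Example 4 immediately refines the verdict:

```python
from z3 import Bools, Real, And, Not, Implies, prove

r0_9, err_9, Ferr_9 = Bools('r0_9 err_9 Ferr_9')
e9 = Real('e9')

# Stabilized entry for instant 9 from Example 4
entry = And(err_9 == And(Not(r0_9), e9 < 5), Implies(err_9, Ferr_9))

# Hypothetical input reading at instant 9: robot not in room 0, battery at 3%
reading = And(Not(r0_9), e9 == 3)

# The refined constraint forces the error (and hence Ferr) to become true
prove(Implies(And(entry, reading), And(err_9, Ferr_9)))  # expected to print "proved"
```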

To apply the computed transformer to the current monitor state, we simply conjoin the constraints of the transformer with those of the monitor state and again use quantifier elimination to remove the variables of the previous instant.

Example 5

Taking again the transformer \(\tau = (e^4 \le e^3 - 3)\) from Example 2 and the monitoring state \(s^\sharp = (90 \le e^3 \le 92)\) from above, we get \( \tau (s^\sharp ) = \exists e^3. (e^4 \le e^3 - 3) \wedge (90 \le e^3 \le 92). \) Applying quantifier elimination yields the state \( s^\sharp = (e^4 \le 89). \)
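This transformer application can likewise be carried out with quantifier elimination in Z3; the sketch below (same illustrative encoding as before) conjoins the transformer with the monitor state and eliminates the previous-instant variable:

```python
from z3 import Real, And, Exists, Tactic

e3, e4 = Real('e3'), Real('e4')

transformer = e4 <= e3 - 3        # tau from Example 2
state = And(e3 >= 90, e3 <= 92)   # monitor state s# from above

# Conjoin, existentially quantify the previous instant, and eliminate the quantifier
next_state = Tactic('qe')(Exists([e3], And(transformer, state))).as_expr()
print(next_state)  # a formula equivalent to e4 <= 89
```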

If the monitor state grows too large we can also apply the second stage of the above strategy to reduce its size at the cost of making the monitor state less precise. As a further optimization, note that from the first fixed point (for \(\varSigma = \top \)), we only need the relation between those variables that are referenced by +1 offsets in the specification. Therefore, during quantifier elimination for the initial fixed point we can also remove all variables from the current instant which are not referenced in this way.

7 Empirical Evaluation

We developed a prototype for symbolic recurrent Lola monitoring in Scala using Z3 [28] as backend solver for symbolic reasoning and quantifier elimination. We evaluated our tool on three case studies running on a 64-bit Linux machine with an Intel Core i7-1365U CPU and 32GB of RAM.

Path Planning. The first case study examines a variation of the vacuum cleaning robot from Fig. 1. The example was extended such that the output not only specifies whether a room can be safely entered but also the amount of surplus or missing energy. This information could then be used to control the robot's behavior, e.g. by switching to a power-saving mode, demonstrating the advantages of a monitoring approach that can compute richer verdicts than just Booleans.

We analyzed the monitor's synthesis time (i.e. the time for the computation of the initial fixed point) and the monitoring time per instant at runtime for a varying number of rooms, simulating a random walk according to the monitor's output. In this case study, the initial semantics could be fully determined without widening, as a repeating symbolic transformer element was found after a few computations. As Fig. 2a shows, the synthesis time grows non-linearly, because the backwards calculation becomes more expensive with longer paths. This can be remedied by simplifying the symbolic representation of the formulas during the computation. However, Z3 is primarily optimized for satisfiability checks, not for simplifying symbolic constraints. We will explore the benefits of specialized simplifiers and further optimizations for reducing the synthesis time in future work.

More important than synthesis time is the execution time of the monitors. The average computation time per instant during the monitor execution, measured with different degrees of induced uncertainty, is shown in Fig. 2b. The runtime increases when uncertainty is introduced, but the time per event remains small (384 ms in the worst case).

Fig. 2.
figure 2

a: Monitor synthesis per number of rooms: synthesis time in seconds and number of computed states. b: Avg. runtime (ms) per instant for different numbers of rooms: fully certain, 30% noisy, and 15% entirely unknown inputs.

Collision Avoidance. In the second case study a robot uses a Lola monitor to navigate through an area with obstacles. The robot receives a set of waypoints from the user and tries to follow them while avoiding the obstacles. The monitor receives as inputs the distance \(\textit{dist}\) to the closest obstacle in front of the robot, as well as its leftmost and rightmost points \(\textit{left}\) and \(\textit{right}\). The monitor outputs the possible steering angles to avoid collisions in the future (see Fig. 3a). Assumptions define parameters like the maximum possible steering angle and the bounding box of the robot (see b and d in Fig. 3a).

Fig. 3.
figure 3

a: Collision avoidance (scheme). b: Screenshot of the simulation in Gazebo.

The study was qualitatively evaluated by integrating our Lola monitoring tool with the robot operating system ROS [30] running on a turtlebot inside the simulation environment Gazebo [23] (see Fig. 3b). The robot follows a user-defined path, periodically calling the monitor for the closest obstacle in front to obtain safe steering angles, from which the robot chooses the one closest to the defined path. The monitor was able to steer the robot without collision with an uncertainty margin of up to 30%. We additionally extracted execution traces and evaluated the performance of the monitor offline. Figure 4 shows the runtime per instant, which increases with growing input uncertainty due to the increasing complexity of the constraint states.

Fig. 4.
figure 4

Average runtime (ms) for uncertainty margins from 0% to 30%.

Program Monitoring. In the third case study we use our approach for traditional program monitoring. An excerpt of the monitored program is shown below on the left.

figure j

At the end of the program we wanted to ensure that the value of the variable y, which is previously computed in a while loop, exceeds 15. We created a Lola specification that receives the current variable values and the current program line as input streams. Furthermore, the program behavior itself was encoded in a straightforward manner as an assumption in the Lola specification. With its anticipation capabilities, the monitor was able to compute the legal values of the variables at certain program positions such that the assertion at the end is satisfied. Thus, it was able to detect program failures at an early stage during program execution.

Since the valid variable values depend on the number of while loop iterations (and thus on the remaining trace length), the initial transformer semantics computation of our approach did not find a repeating transformer. Consequently, the widening strategy described above was applied to obtain a sound recurrent Lola monitor for the specification. In this particular example, the simple interval widening was still able to capture that the variable x has to be at least 8 before entering the while loop, while some other relations between variables were over-approximated. Still, when an input was entered in line 1 that ultimately led to \(\texttt {x} < 8\) in line 3, the monitor detected the failure right there. Altogether, this provides an illustrative example of how the approach of this paper could be used for a mixture of static and dynamic program analysis, which at a larger scale, however, would require more sophisticated widening techniques than those in our current implementation.

8 Conclusion

In this paper we have studied general anticipatory monitoring of Lola specifications under uncertainty and assumptions. We have introduced a hierarchy of monitoring semantics and presented an abstraction-based monitoring framework, from which we developed a general sound or perfect online monitoring algorithm for Lola. This algorithm considers future continuations of the received input, given an abstraction of the stream data values. Finally, we have presented an instantiation of this algorithm based on a symbolic representation and evaluated the approach in three practical scenarios. Due to Lola's universality, our theory can also serve as a general framework for anticipatory monitoring of synchronous RV formalisms.

Future work includes a more efficient implementation, especially improving the simplification of the symbolic constraints applied during monitoring, and applications to other Lola fragments beyond linear arithmetic. We also plan to extend the approach to infinite traces and asynchronous SRV formalisms.