1 Introduction

Symbolic Model Checking [16] is a powerful formal verification technique for proving temporal properties of transition systems (a.k.a. models) represented by logical formulae. In the case of Linear Temporal Logic (LTL) [15], the properties can be translated into symbolically represented \(\omega \)-automata, which are then conjoined with the model and proved by search-based techniques that exhaustively analyze the infinite traces of the system [7]. Runtime Verification (RV) [10, 13], on the other hand, is a lightweight verification technique for checking whether a given property is satisfied (or violated) on a finite trace of the system under scrutiny (SUS). In general, LTL-based RV problems can be resolved by automata-based [1], rewriting-based [17], or rule-based [11] approaches.

In this paper, we present a new tool called NuRV, an extension of the nuXmv [4] model checker for LTL-based RV. To the best of our knowledge, this is the first time that a model checker has been directly modified (or extended) into a runtime monitor (or monitor generator). It is natural to do so, as nuXmv already provides the needed infrastructure, such as a symbolic translation from LTL to \(\omega \)-automata, an algorithm for computing the “fair states” (those leading to infinite paths), and an interface to a BDD library [3] based on CUDD 2.4.1 [18].

For the monitoring algorithm implemented in NuRV (cf. [6] for more details), our starting point is the automata-based approach [1] based on \(\mathrm {LTL}_3\), implemented symbolically. Given a monitoring property \(\varphi \), we first run the LTL translation twice, on \(\varphi \) and \(\lnot \varphi \), to get two symbolic automata \(T_\varphi \) and \(T_{\lnot \varphi }\), respectively. Then an input trace u is synchronously simulated on \(T_\varphi \) and \(T_{\lnot \varphi }\), by repeatedly computing forward images w.r.t. all fair states. For each input state of u, we get two sets of belief states, \(r_\varphi \) and \(r_{\lnot \varphi }\). Based on their emptiness, the monitor returns one of the following verdicts:

  • conclusive true (\(\top \)), if \(r_\varphi \ne \emptyset \) and \(r_{\lnot \varphi } = \emptyset \). \(\varphi \) is verified for all future inputs;

  • conclusive false (\(\bot \)), if \(r_\varphi = \emptyset \) and \(r_{\lnot \varphi } \ne \emptyset \). \(\varphi \) is violated for all future inputs;

  • inconclusive (?), if \(r_\varphi \ne \emptyset \) and \(r_{\lnot \varphi } \ne \emptyset \). In this case, the knowledge of the monitor is limited by the finiteness of u.

Besides the property \(\varphi \), the monitoring algorithm takes as input a model K of the SUS. This is used to declare the variables in which the properties are expressed, but more importantly to define some constraints on their temporal evolution, which represent assumptions on the behavior of the SUS. By considering only (infinite) traces of K, the above algorithm may give more precise outputs (turning ? into \(\top /\bot \)). This is obtained by using \(K \otimes T_\varphi \) (the synchronous product of K and \(T_\varphi \)) and \(K \otimes T_{\lnot \varphi }\) instead of \(T_\varphi \) and \(T_{\lnot \varphi }\), respectively. This coincides with [12], where the resulting monitor is called predictive.

The model is used by NuRV in different novel ways. First of all, there is the possibility that \(u \notin L(K)\), because the model may be wrong, or because it only captures partial knowledge of the SUS, or due to unexpected faults. In this case we have \(r_\varphi = r_{\lnot \varphi } = \emptyset \) in the above algorithm, and we naturally let the monitor return a fourth verdict called out-of-model (\(\times \)). This is why we call K an assumption; the two verdicts \(\top /\bot \) are only conclusive under assumptions, and are thus renamed to \(\top ^\mathrm {a}/\bot ^\mathrm {a}\). This extended RV approach may be called assumption-based. In particular, if one only cares whether the SUS always follows its model, we can use the dummy LTL property \(\mathrm {true}\) in the above procedure, so that \(K \otimes T_{\lnot \varphi }\) is always empty, and the monitor will output either \(\top ^\mathrm {a}\) or \(\times \), indicating whether \(u \in L(K)\). This application coincides with model-based RV [19].
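The case analysis on the emptiness of \(r_\varphi \) and \(r_{\lnot \varphi }\) can be summarized in a few lines. The following Python sketch is illustrative only: it represents the two belief states as plain Python sets rather than BDDs, and the names Verdict and verdict are hypothetical, not part of NuRV.

```python
from enum import Enum


class Verdict(Enum):
    """The four ABRV verdicts (names are illustrative, not NuRV identifiers)."""
    TRUE_A = "conclusive true under assumption"
    FALSE_A = "conclusive false under assumption"
    INCONCLUSIVE = "inconclusive"
    OUT_OF_MODEL = "out of model"


def verdict(r_phi, r_not_phi):
    """Map the emptiness of the two belief states to a verdict.

    r_phi and r_not_phi stand in for the belief states reached in
    K x T_phi and K x T_not_phi; here they are ordinary sets.
    """
    if r_phi and not r_not_phi:
        return Verdict.TRUE_A
    if not r_phi and r_not_phi:
        return Verdict.FALSE_A
    if r_phi and r_not_phi:
        return Verdict.INCONCLUSIVE
    # Both belief states are empty: the trace has left the model K.
    return Verdict.OUT_OF_MODEL
```

With the dummy property true, the second argument is always empty, so only TRUE_A and OUT_OF_MODEL can be returned, which matches the model-based RV use described above.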

Second, the above monitoring algorithm directly supports partially observable traces, i.e. the variables appearing in the monitoring property are not (always) known in each state of the input trace. This is because the symbolic forward-image computations do not require full observability; less restrictive inputs result in coarser belief states. Partial observability becomes more useful under assumptions, as an assumption may express a relation between observable and unobservable variables of the SUS.
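To make the effect of partial observations concrete, the sketch below computes forward images over an explicit set of states. This is only a stand-in for the BDD-based computation: the two-variable model, the transition relation and the helper names (consistent, forward_image) are assumptions chosen for illustration.

```python
from itertools import product

# States are (p, q) pairs of Booleans; VAR_INDEX maps names to positions.
ALL_STATES = set(product((False, True), repeat=2))
VAR_INDEX = {"p": 0, "q": 1}


def consistent(state, obs):
    """True if `state` agrees with every variable that `obs` fixes."""
    return all(state[VAR_INDEX[var]] == val for var, val in obs.items())


def forward_image(belief, trans, obs):
    """Successors of `belief` under `trans` that agree with the observation."""
    return {t for (s, t) in trans if s in belief and consistent(t, obs)}


# Assumed model: invariant p != q, and any legal state may follow any other.
legal = {s for s in ALL_STATES if s[0] != s[1]}
trans = {(s, t) for s in legal for t in legal}

full = forward_image(legal, trans, {"p": True, "q": False})
partial = forward_image(legal, trans, {"p": True})  # q is unobserved
blind = forward_image(legal, trans, {})             # nothing is observed
```

Here the partial observation of p alone yields the same belief state as the full observation, because the assumption p != q determines q; with no observation at all the belief state grows to every legal state.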

Third, NuRV supports resettable monitors, i.e. it can evaluate an LTL property at arbitrary positions of the input trace. This idea was inspired by the observation that, in \(r_\varphi \) and \(r_{\lnot \varphi }\), all variables (some of which are generated by the LTL translations) related to the present and the past have the same values, while all variables related to the future have opposite values. There is no easy way to distinguish these two groups of variables. However, by taking \(r_\varphi \cup r_{\lnot \varphi }\) we obtain a new belief state which represents the history of the system after a run given by the input trace seen so far. If we restart the monitoring algorithm at state i using this history as the new initial condition of K (also with a reduced version of the initial conditions of \(T_\varphi \) and \(T_{\lnot \varphi }\)), the new monitor essentially evaluates \(\varphi \) from position i of u (for \(|u| > i\)), with the underlying assumptions taken into account. This is again an orthogonal feature, but having an assumption makes resetting the monitor more interesting, as the assumption evolves to take the history of the system into consideration.

Furthermore, NuRV can synthesize the symbolic monitors into explicit-state monitor automata and then translate them into standalone monitor code in various programming languages (currently we support C, C++, Java, and Common Lisp). Besides, it is possible to dump the monitor automata into SMV modules, which can be further analyzed in nuXmv for their correctness and other properties.

The rest of this paper is organized as follows: In Sect. 2 we describe the architecture and functionalities of NuRV. Some use case scenarios (as running examples) are given in Sect. 3. Section 4 shows some experimental evaluation results. Finally, we conclude the paper in Sect. 5 with some directions for future work.

2 Architecture and Functionalities

NuRV implements Assumption-based Runtime Verification (ABRV) with partial observability and resets, as described in [6]. Monitoring properties are expressed in Propositional Linear Temporal Logic (LTL) [15] with both future and past temporal operators. For each input state, the monitor outputs one of four verdicts in \(\mathbb {B}_4 {\,\dot{=}\,}\{\top ^\mathrm {a}, \bot ^\mathrm {a}, ?, \times \}\). As a program, NuRV takes an assumption (as an SMV model), some LTL properties and input traces, and outputs the verification results or some standalone monitor code, according to a batch of commands. The reader may refer to [6] for the formal definition of the LTL semantics and the related RV problems.

Fig. 1. The internal structure of NuRV

2.1 Architecture of NuRV

The internal structure of NuRV is shown in Fig. 1. The monitor construction starts from the modular description of a model K (used as the assumption in ABRV) and a set of LTL properties \(\varphi _1, \ldots , \varphi _n\). The model is also used to declare the variables (and their types) in which the LTL properties are expressed, and thus the alphabet of the input words of the monitors. NuRV inherits nuXmv’s support for hierarchical models and rich variable types (such as bounded integers and arrays); all input data (models, properties and traces) are flattened and Boolean-encoded before going to further steps. The Model Construction component generates (from the model) a BDD-based representation of the Finite State Machine (FSM), which is then used in the monitor construction step, together with the monitoring property, to produce another BDD-based FSM representing the symbolic monitor. The resulting monitor can be used in two ways: (1) as an online/offline monitor running inside nuXmv, accepting finite traces incrementally and outputting a verification result for each input state; (2) as the input of the Monitor Generator component, resulting in standalone monitor code. From the end-users’ point of view, NuRV extends nuXmv with the following new commands:

  1. build_monitor: build the symbolic monitor for a given LTL property;

  2. verify_property: verify a currently loaded trace in the symbolic monitor;

  3. heartbeat: verify one input state in the symbolic monitor (online monitoring);

  4. generate_monitor: generate standalone monitors in a target language.

The commands build_monitor and verify_property together implement the offline monitoring algorithm described in [6]. The command generate_monitor further generates explicit-state monitors in various languages from the symbolic monitor built by the command build_monitor. These commands must be used together with other nuXmv commands [2] to be useful.

2.2 Structure of Explicit-State Monitors

The Monitor Generator component internally generates monitor code in two steps: (1) generating explicit-state monitor automata from the symbolic monitor; (2) converting the monitor automata into code in specific languages. NuRV can generate three levels of explicit-state monitors:

L1: The monitor synthesis stops at all conclusive states;

L2: The monitor synthesis explores all states;

L3: The monitor synthesis explores all states and reset states.

Fig. 2. Explicit-state monitors of \(p\,\mathbf {U}\,q\) (assuming \(p\ne q\)) (L1–L3)

A sample explicit-state monitor for the LTL property \(p\,\mathbf {U}\,q\) generated by NuRV is shown in Fig. 2. The monitor is generated under the assumption that either p or q (but not both) is true in the input. The monitor starts at location 1 and returns ? as long as the input is \(p \wedge \lnot q\), until it receives \(\lnot p \wedge q\), for which the output is \(\top ^\mathrm {a}\) (Y). The L1 monitor has no further transitions at locations associated with conclusive verdicts (\(\top ^\mathrm {a}\) or \(\bot ^\mathrm {a}\)), since it can be easily proved that ABRV-LTL monitors are monotonic if the assumption is always respected by the input trace. The L2 monitor contains all locations and transitions, thus it may return \(\times \) even after the monitor has reached a conclusive verdict. The L3 monitor additionally contains information for the resets: in case the monitor is reset, the current location first jumps to the location indicated in the brackets \(\texttt {[]}\) of the current location, and then goes to the next location according to the input state. In the above monitor, however, all reset locations are just the initial location (1). This is mostly because the assumption is an invariant property and the LTL property does not have any past operators.
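The behaviour described above can be mimicked by a small hand-written state machine. The Python sketch below is not NuRV output: the location numbers 2 and 3, the numeric code chosen for \(\times \), and the function names are assumptions made for illustration. Only the initial location 1 and the codes 0 (for ?) and 1 (for \(\top ^\mathrm {a}\)) follow the text. Input states are binary-encoded with p as the least significant bit.

```python
# Verdict codes: 0 and 1 follow the text; the code for "out of model" is assumed.
UNKNOWN, TRUE_A, OUT_OF_MODEL = 0, 1, 3

# Binary-encoded input states (p is bit 0, q is bit 1).
P_ONLY, Q_ONLY = 0b01, 0b10

# Location 1 is the initial location; 2 and 3 are hypothetical numbers.
INITIAL_LOC, TRUE_LOC, SINK_LOC = 1, 2, 3


def step(state, reset, loc):
    """Process one input state; return (verdict, next_location)."""
    if reset:
        # Every reset location of this monitor is the initial location.
        loc = INITIAL_LOC
    if loc == SINK_LOC or state not in (P_ONLY, Q_ONLY):
        # The input violates p != q, or the trace already left the model.
        return OUT_OF_MODEL, SINK_LOC
    if loc == TRUE_LOC or state == Q_ONLY:
        return TRUE_A, TRUE_LOC
    return UNKNOWN, INITIAL_LOC


def run(trace):
    """Feed a whole trace to the monitor, resetting only on the first state."""
    loc, outputs = INITIAL_LOC, []
    for i, state in enumerate(trace):
        out, loc = step(state, i == 0, loc)
        outputs.append(out)
    return outputs
```

For example, run([P_ONLY, P_ONLY, Q_ONLY]) stays inconclusive on the first two states and becomes conclusive on the third, while an input with both p and q true sends the monitor to the assumed sink location.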

Standalone monitor code is literally translated from these monitor automata (FSMs). The correctness of monitors in C, for instance, comes indirectly from the correctness of the symbolic algorithm and from model checking on SMV-based monitors.

2.3 API of Generated Code

NuRV currently supports monitor code generation into five languages: C, C++, Java, Common Lisp and SMV. The structure of monitor code is simple yet efficient: it simply mimics the simulations of deterministic FSMs.

The monitor code generated (in C, for example) has the following signature:

[Figure a: C signature of the generated monitor function]

The function name (monitor here) is given by the user. It takes three parameters: (1) state, an encoded long integer representing the current input state of the trace; (2) reset, an integer representing the possible reset signal; and (3) current_loc, a pointer to an integer holding the internal state of the monitor. It is the caller’s responsibility to allocate an integer and provide the pointer to the monitor (otherwise the function returns −1, indicating an invalid location), and this integer is actually the only thing that identifies a monitor instance. The sole purpose of the function is to update *current_loc (the value behind the pointer) according to state and reset and to return a monitoring output. NuRV supports two different encodings for state:

  1. Static partial observability: state denotes a full assignment of the observables, encoded in binary bits: 0 for false (\(\bot \)), 1 for true (\(\top \));

  2. Dynamic partial observability: state denotes a ternary number, each digit of which represents one of the 3 possible values of an observable variable: 0 for unknown (?), 1 for true (\(\top \)) and 2 for false (\(\bot \)).

Note that the symbolic monitoring algorithm can in general take input states expressed as Boolean formulae (e.g., if the observables are p and q, our monitor may take the input state “\(p\;\mathrm {xor}\;q\)”, i.e., either p or q is true but not both), but this is not supported by the generated code.

BDD operations are implemented by the BDD manager. Their performance strongly depends on the variable ordering used in the BDD construction. This can be controlled by setting an input_order_file in nuXmv. The input of the generated monitor code requires an encoding of BDDs into long integers according to this file. This encoding is done from the least to the most significant bit. For instance, if the observables are p and q in this order, a binary encoding for the state \(\{p = \top , q = \bot \}\) would be \((01)_2 = 1\), and a ternary encoding for the same state would be \((21)_3 = 7\). The design purpose is to make sure that the comparison of two encoded states is as fast as possible. The signatures of monitors in other languages are quite similar, except that the parameter current_loc can be put inside C++/Java classes as a member variable, and each monitor is an instance of the generated monitor class.
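The two encodings can be reproduced with a short helper. The Python sketch below is an illustration under the conventions stated above (first variable in the ordering is the least significant digit); the function names encode_binary and encode_ternary are hypothetical and not part of NuRV.

```python
def encode_binary(order, assignment):
    """Encode a full assignment; the first variable in `order` is bit 0."""
    code = 0
    for i, var in enumerate(order):
        if assignment[var]:
            code |= 1 << i
    return code


# Ternary digits as described in the text: 0 unknown, 1 true, 2 false.
TERNARY_DIGIT = {None: 0, True: 1, False: 2}


def encode_ternary(order, assignment):
    """Encode a possibly partial assignment; missing variables are unknown."""
    code = 0
    for i, var in enumerate(order):
        code += TERNARY_DIGIT[assignment.get(var)] * 3 ** i
    return code


order = ["p", "q"]
state = {"p": True, "q": False}
binary_code = encode_binary(order, state)    # (01)_2 = 1
ternary_code = encode_ternary(order, state)  # (21)_3 = 7
```

A state in which only p is known to be true, such as {"p": True}, is encoded in ternary as 1, since the digit for the unknown q is 0.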

3 Use Case Scenario

Now we briefly demonstrate the process of generating monitors for the LTL properties \(\varphi _0 = p\,\mathbf {U}\,q\) and \(\varphi _1 = \mathbf {Y} p \vee q\), assuming \(p\ne q\). The batch of commands shown in Fig. 3 does the work (see also Fig. 4 for the contents of the two helper files).

Fig. 3. The batch commands

The command go builds the model from the input file disjoint.smv which defines two Boolean variables p and q, together with the invariant \(p \ne q\).

The generated monitors M0.c and M1.c (together with their C headers) are under the full observability of p and q. The variable ordering is given by the file default.ord, in which each line denotes one variable in the model.

The simplest way to use a generated monitor, M0 for instance, is to declare an integer and call the monitor function as follows (e.g., when monitoring a C program linked with the generated monitor code, p and q may denote two assertions in the program):

[Figure b: example calls of the generated monitor function M0]

Fig. 4. disjoint.smv and default.ord

There is no need to initialize the integer monitor_loc, as the first M0 call with a value of 1 also initializes the monitor. (Actually, it just sets monitor_loc to 1; we may call it a hard reset.) The first function call returns 0, indicating the ABRV-LTL value ? (unknown); the second call returns 1, indicating \(\top ^\mathrm {a}\) (conclusive true).

Fig. 5. Offline monitoring in NuRV

For offline monitoring, there is no need to call generate_monitor in the above batch of commands. Suppose a trace \(u=p\,p\,p\,q\,q\,q\) has been loaded (by read_trace); the command verify_property then verifies the trace against the symbolic monitor of \(\varphi _0\), as shown in Fig. 5 (here “−n 0” denotes the first monitor, and 1 denotes the first loaded trace).

It is also possible to verify just one input state with heartbeat (online monitoring). Its interface is similar to that of verify_property, except that the trace ID is replaced by a single state expressed as a logical formula (given as a string), e.g. "p & !q".

4 Experimental Evaluation

We have done some comparison tests between NuRV and the latest release of RV-Monitor [14]. To show the feasibility and effectiveness of RV tools, we tried to generate LTL monitors from a wide range of practical specifications, i.e. Dwyer’s LTL patterns [8]. The purpose is to generate the same monitors from NuRV and RV-Monitor (rvm) and compare their performance and other characteristics. All these patterns are expressed in six Boolean variables (p, q, r, s, t and z). RV-Monitor is event-based, i.e. the alphabet is the set of these variables instead of their power set. This means our monitors can be built under the assumption that all six variables are disjoint.

Table 1. Eight long formulae from Dwyer’s patterns

Unfortunately, RV-Monitor (rvm) fails to generate monitors from eight long formulae (Patterns 13, 14, 39, 43, 44, 49, 53 and 54), shown in Table 1. It also does not generate monitors from any of the ten safety properties (Patterns 5, 7, 22, 25, 27, 40, 41, 42, 45 and 50). Eventually we got only 37 rvm monitors out of 55 LTL patterns, and we confirmed that, whenever rvm monitors report violations, our monitors behave the same. Our 55 monitors were generated in 0.467 s (MacBook Pro with Intel Core i7 2.6 GHz, 4 cores) using a single core, while the 37 rvm monitors were generated in 78.619 s on the same machine using multiple cores.

Fig. 6. Performance of generated Java monitors on \(10^7\) states.

We observed that rvm monitors do not report further violations once the first violation happens, and go into terminal states. To get visible performance metrics, we chose to reset all monitors once a violation is reported. Also, to prevent extra performance loss in rvm monitors caused by creating multiple monitor instances [5], we used a single trace (stored in a vector) with \(10^7\) random states. For each of the 37 LTL patterns, we recorded the time (in ms) spent by both monitors (running in the same Java process); the results are shown in Fig. 6. Our monitors (in Java) show a nearly constant running time (approx. 250 ms), i.e. the time needed for processing one input trace is almost the same for all patterns. This reflects the spirit of automata-based approaches. Rvm monitors vary from 500 ms to more than 6 s, depending on the number of resets.

5 Conclusions and Future Work

We presented NuRV, a nuXmv extension for Runtime Verification. It supports assumption-based RV for propositional LTL with both future and past operators, with support for partial observability and resets. It provides offline and online monitoring, as well as code generation of the monitors in various programming languages. The experimental evaluation on standard LTL patterns shows that NuRV is efficient in both monitor generation time and running time. In the future, we plan to participate in the RV competition to broaden the tool comparison and to extend the monitor specification language beyond the propositional case.