Runtime verification of embedded real-time systems
Abstract
We present a runtime verification framework that allows online monitoring of past-time Metric Temporal Logic (ptMTL) specifications in a discrete time setting. We design observer algorithms for the time-bounded modalities of ptMTL, which take advantage of the highly parallel nature of hardware designs. The algorithms can be translated into efficient hardware blocks, which are designed for reconfigurability and thus facilitate applications of the framework in both the prototyping and the post-deployment phase of embedded real-time systems. We provide formal correctness proofs for all presented observer algorithms and analyze their time and space complexity. For example, for the most general operator considered, the time-bounded Since operator, we obtain a time complexity that is doubly logarithmic both in the point in time the operator is executed and in the operator’s time bounds. This result is promising with respect to a self-contained, non-interfering monitoring approach that evaluates real-time specifications in parallel to the system-under-test. We implement our framework on a Field Programmable Gate Array platform and use extensive simulation and logic synthesis runs to assess the benefits of the approach in terms of resource usage and operating frequency.
Keywords
Runtime verification · Embedded real-time systems · Past-time logics · Online monitoring
1 Introduction
Rigorous verification strategies are especially vital for the domain of safety-critical embedded real-time systems [48], where systems often need to comply not only with a set of functional requirements but also, equally important, with tight timing constraints. Correct behavior of these systems is defined by the sequence of data they produce, either internally or at their physical outputs, complemented with their temporal behavior. The key idea behind formal verification techniques such as model checking [6, 22] is to exhaustively check all executions of a structure that is related to an implementation and its environment against given requirements, the latter of which are often formalized in terms of a temporal logic. Exhaustive analysis of programs, however, often suffers from practical infeasibility (due to state space explosion [21]) and/or theoretical impossibility (due to undecidability results).
In runtime verification [9], observers are synthesized to automatically evaluate the current execution of a system-under-test (SUT), typically from a formal specification in a logic that is suitable to cover certain forms of real-world specifications. The on-the-fly nature of runtime verification can come with costly overhead [10, 56, 71]. Some approaches mitigate this overhead by reducing instrumentation points [34]; others port the system and/or the observers to a more powerful architecture, such as database systems [8]. These artifacts of runtime verification are not compatible with embedded real-time systems running on ultra-portable hardware with power and performance limitations [65].
To evaluate specifications, runtime verification depends on observations of the state of the SUT. These observations are referred to as events and are input to the observer. However, the SUT’s state typically is not directly observable.

Source code instrumentation of high-level languages can only capture events that are accessible from within the instrumented software system. Embedded systems [59] often include both hardware and mechanical parts; events from those might go unnoticed for an instrumenting runtime verification approach.

The timing behavior of the SUT is altered by instrumentation [23, 34]. The additional runtime overhead may drastically impact the correctness of a heavily loaded real-time application with tight deadlines. The same applies to the memory consumption of resource-constrained systems. The relevance of this argument is supported by the fact that restricted architectures are often used in critical environments [12, 33, 66], such as in nuclear power plants [28] and spacecraft [30, Chap. 3].

Instrumentation may make re-certification of the system onerous (e.g., systems certified for civil aviation after DO-178B [73]).

In its present shape, runtime verification often analyzes the correctness of high-level code. However, to show that a high-level specification is correctly reproduced by the target system, it is further necessary to show the correctness of the translation of the high-level code into executable code, i.e., of the compiler. Despite recent breakthroughs [52, 53], only a few verified compilers are used in practice, and flaws introduced by compilers [31, 55, 81] may remain undetected by existing approaches.

Instrumentation at binary code level may circumvent the process of establishing correctness of the compiler. However, binary instrumentation is incomplete as long as a sound reconstruction of the control flow graph is not obtained from the binary. Despite being an active area of research [7, 35, 46, 67], generating sound yet precise results remains a challenge.
There exist, however, systems and applications [80] where the relevant events can be observed without the need to infuse additional functions into the high-level code. Consider, for example, an implementation of a network protocol, where the task is to check the correctness of data flow between two network nodes. It appears natural to place an additional (passive) node in the network that collects events sent over the network, rather than instrumenting the high-level code of the network nodes. The strength of such an approach is that the collection of events is non-intrusive, at least as long as the additional node is passive and does not actively participate in the communication. It is important to observe that information exchange among systems is often performed through standardized interfaces. This is especially the case for embedded real-time systems, at various levels of detail [59, Chap. 3]. For certain systems, wiretapping is the only option left to gain information about the state of the system, for example, if the design includes proprietary hardware or software components.
 Standalone

The runtime verification framework should not only be deployed during the testing phase of the product but also after the product is shipped. Therefore, it should operate in a self-contained way and not depend on a powerful host computer that executes the observer.
 Nonintrusive

The resulting observers should be efficient enough to not alter the timing behavior of the SUT. From an algorithmic viewpoint, observers with an a priori known execution time are of utmost importance so as to statically determine upper bounds on the execution time of the observer. From an implementation point of view, we need to provide measures to passively observe events from the SUT.
 Timed

To support correctness claims that involve timed properties, the framework should support expressive logics to formalize not only functional but also real-time requirements.
 Reconfigurable

For the testing phase, the framework should be reconfigurable without the need to re-synthesize the whole hardware design, which may take dozens of minutes to complete, for example when targeting a Field Programmable Gate Array (FPGA) platform.
2 Contributions and roadmap
 (a)
We present online observer algorithms that allow one to verify whether a past-time metric temporal logic (ptMTL) formula holds at (discrete) times \(n \in\mathbb{N}_{0}\). The algorithms make use of basic operations only and are stated in a way that allows for a direct implementation in hardware that can run without a host computer. Thereby, our observers fulfill the timed and standalone requirements.
 (b)
We formally prove the observers’ correctness and derive bounds on their time complexity in terms of gate delays and their space complexity in terms of required memory bits. With n being the time an observer algorithm is executed and J a nonempty interval, we obtain, for the most general of the presented observer algorithms, the ptMTL Since operator \(\varphi_{1}\,S_{J}\,\varphi_{2}\), a time complexity of only \(\mathcal{O}(\log_{2}\log_{2}\max(J \cup\{n\}))\). The observer’s space complexity is dominated by the size of a list it needs to maintain. We show that the list’s space complexity is at most \(2\lceil\log_{2}(n)\rceil\cdot(2\max(J)-\min(J)+2)/(2+\operatorname{len}(J))\), where \(\operatorname{len}(J)=\max(J)-\min(J)\). Both complexity results, as well as the fact that our algorithms refrain from loops and recursions and build on simple operations only, enable applications of our runtime verification framework on resource-limited platforms that require predictable timing and memory consumption.
 (c)
We explain how to derive non-instrumenting, efficient realizations of the proposed observer algorithms in hardware. The resulting hardware profits from the simplicity and low complexity of our highly parallel observer algorithms. In contrast to instrumentation-based runtime verification techniques for software systems, our observers are well suited to supervise hardware components. Thereby, in combination with (b), our observers fulfill the non-intrusive requirement. Although our algorithms are tailored for a hardware implementation, the observers can easily be adapted to run in software, too. Reconfigurability of our observers is achieved by letting a programmable, specifically tailored microprocessor control a pool of observers, instead of hardwiring the observers’ inputs and outputs according to their parse tree.
 (d)
To evaluate the effectiveness of our approach, we report on a thorough study of simulation traces and synthesis results of a full-fledged hardware implementation of the presented observer algorithms and discuss the scalability of our approach.
With regard to the contributions above, (a) and (b) extend the work we presented at the International Conference on Runtime Verification [71], including detailed correctness proofs for our algorithms, whereas (c) and (d) are unique contributions of this article. Contribution (c) builds on our previous work [69], where we presented a microprocessor designed to evaluate ptLTL specifications in a software-oriented fashion. Using this approach to check ptMTL specifications, however, requires a costly (cf. Sect. 3.3) rewriting to an equivalent ptLTL specification. Instead, we show how to map the building blocks of our ptMTL observer algorithms into efficient hardware units. This enables our microprocessor to natively evaluate ptMTL specifications in real time. Both (c) and (d) help us to put the presented real-time observer algorithms into industrial practice.
The contributions of this article are presented as follows. First, Sect. 3 is a primer on temporal logics, which sets the scene for the monitoring algorithms stated in Sect. 4. Section 5 details the key structures of the hardware design and Sect. 6 reports on experimental evidence. We continue with a survey of related work in Sect. 7 and conclude in Sect. 8.
3 Logics for runtime verification
We briefly summarize the temporal logics past-time linear temporal logic (ptLTL) and past-time metric temporal logic (ptMTL), which are used to specify properties in our framework. Both allow one to specify safety, past-time properties over executions. For further details, we refer the reader to more elaborate sources such as [2, 13, 32, 42, 51, 57].
3.1 Past-time linear temporal logic
3.2 Past-time metric temporal logic
Example
A timed requirement such as “If the system leaves the idle mode, it has received an according signal in the past 50 clock cycles.” can be expressed in ptMTL.
3.3 Rewriting past-time metric temporal logic to past-time linear temporal logic
In a hardware implementation, one can make use of shift registers to store the relevant part of the execution path with regard to the truth values of φ _{1} and φ _{2}. We will proceed with a sample implementation making use of the equivalence above.
Example
It is important to observe that the chain of AND gates starting at ⊙^{0} φ _{1} introduces a gate propagation delay Δ [44, Chap. 9] on the signal that is proportional to b and delays the output of the verdict e ^{ n }⊨φ _{1} S _{[a,b]} φ _{2}. With a propagation delay δ _{AND} of a single AND gate and an AND chain of length b−1, the total propagation delay equals Δ=(b−1)×δ _{AND}. The chain becomes the critical path of the circuit and lowers the achievable operational frequency of the observer design. This effect can be alleviated by introducing a pipeline, although not without the cost of additional memory and control logic.
This supports the view that rewriting ptMTL to ptLTL, albeit theoretically possible, is costly and thus infeasible in practice for applications where the satisfaction relation is checked on-the-fly, i.e., in parallel to the SUT. Rewriting, however, may prove feasible when the observer is executed on a powerful host computer with a capable term rewriting engine at hand, as studied in [72].
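As a rough illustration of how this critical path scales with the upper bound b, the following sketch evaluates Δ=(b−1)×δ _{AND} and the resulting upper limit 1/Δ on the clock frequency. The function names and the gate delay of 0.5 ns are assumptions chosen for illustration only, and all other delays on the path are ignored.

```python
def chain_delay_ns(b: int, delta_and_ns: float) -> float:
    """Total propagation delay of an AND chain of length b - 1, in ns."""
    if b < 1:
        raise ValueError("b must be at least 1")
    return (b - 1) * delta_and_ns


def max_frequency_mhz(b: int, delta_and_ns: float) -> float:
    """Upper limit on the clock frequency if the chain is the critical path."""
    delay = chain_delay_ns(b, delta_and_ns)
    if delay == 0:
        return float("inf")
    return 1e3 / delay  # 1/ns is GHz, so scale by 1000 to obtain MHz


if __name__ == "__main__":
    # Hypothetical gate delay of 0.5 ns; the bound b is varied.
    for b in (11, 51, 501):
        print(b, chain_delay_ns(b, 0.5), max_frequency_mhz(b, 0.5))
```

Under these assumptions, increasing b by a factor of ten lowers the frequency limit by roughly the same factor, which is the linear dependence described above.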
4 Observer design for real-time properties
In the following, we discuss the formal design of online observer algorithms for specifications in ptMTL in a discrete time model. The design is inspired by the observers described in [11] and extends work on observers for ptLTL [42] which have been built in hardware [63, 68]. We first give a high-level definition of the algorithms and turn to a hardware implementation in Sect. 5.
4.1 Decomposing a specification
 (i)
φ=true returns true.
 (ii)
φ=false returns false.
 (iii)
φ=σ, where σ∈Σ returns true if σ holds on s _{ n }, and false otherwise.
 (iv)
φ=φ _{1}•φ _{2} is true if e ^{ n }⊨φ _{1}•e ^{ n }⊨φ _{2}, where •∈{∧,∨,→}, and false otherwise.
 (v)
 (vi)
For φ=φ _{1} S _{ J } φ _{2}, we collect all times where φ _{2} was true in the past and since then φ _{1} remained true and store them in a list. At time n we check if there exists a time τ in the list such that n−τ∈J. If such a τ exists we return true, and false otherwise.
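The informal description in (vi) can be sketched as follows. This is a direct, unoptimized reading of that description, not one of the observer algorithms of this article: it keeps an unbounded list of single time points, whereas the observers discussed later maintain a succinct list of time point pairs. The function name and the encoding of executions as Boolean sequences are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def since_verdicts(phi1: Sequence[bool], phi2: Sequence[bool],
                   interval: Tuple[int, int]) -> List[bool]:
    """Verdicts of phi1 S_J phi2 at times 0, 1, ... for J = [lo, hi]."""
    lo, hi = interval
    candidates: List[int] = []  # times where phi2 held and phi1 has held since
    verdicts: List[bool] = []
    for n, (p1, p2) in enumerate(zip(phi1, phi2)):
        if not p1:
            candidates.clear()    # phi1 is violated at n: earlier times are lost
        if p2:
            candidates.append(n)  # phi1 is only required strictly after n
        # Check whether some stored time tau satisfies n - tau in J.
        verdicts.append(any(lo <= n - tau <= hi for tau in candidates))
    return verdicts
```

For instance, with φ _{2} true only at time 0, φ _{1} true at times 0 to 2, and J=[1,2], the sketch reports true exactly at times 1 and 2.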
Running example
4.2 The invariant and exists previously operators
For example, ↑σ _{1} → ⊡_{10} σ _{2} expresses that whenever σ _{1} becomes true, σ _{2} holds at all 10 previous time units. For both the invariant previously and the exists previously operator we present simplifications that yield space- and time-efficient observers.
Invariant previously (⊡_{ τ } φ)
The formula ⊡_{ τ } φ is transformed into ¬(true S _{[0,τ]} ¬φ) by (1). An observer for ⊡_{ τ } φ requires a single register with domain \(\mathbb{N}_{0} \cup\{ \infty \}\). Note that an actual implementation of this observer algorithm clearly must restrict itself to a bounded domain {0,1,…,N}∪{∞}, where N is chosen sufficiently large to cover the expected mission time of the system being analyzed. We will discuss implementation considerations of our observers in Sect. 5 and meanwhile assume unbounded domain registers.
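A single-register observer of this kind can be sketched as follows. The sketch is a reconstruction from the semantics ¬(true S _{[0,τ]} ¬φ) and is not a transcription of Algorithm 1: the register name m, its initial value ∞, and the convention that φ counts as false before time 0 are assumptions.

```python
import math
from typing import Iterable, Iterator


def invariant_previously(phi: Iterable[bool], tau: int) -> Iterator[bool]:
    """Yield, for each time n, whether phi held at all times in [max(0, n - tau), n]."""
    m = math.inf   # assumed initial value: no run of phi is currently open
    prev = False   # assumption: phi is treated as false before time 0
    for n, cur in enumerate(phi):
        if cur and not prev:      # rising transition: a run of phi starts at n
            m = n
        elif prev and not cur:    # falling transition: the run has ended
            m = math.inf
        prev = cur
        # phi has held on [m, n]; the verdict is true iff that run
        # reaches back at least to max(0, n - tau).
        yield m <= max(0, n - tau)
```

The per-step work is one edge detection and one comparison against max(0, n−τ), independent of τ, which is what makes a register-based realization attractive compared to a shift-register chain of length τ.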
Theorem 1
For all \(n\in\mathbb{N}_{0}\), the observer stated in Algorithm 1 implements e ^{ n }⊨⊡_{ τ } φ.
Proof
Running example
Consider a formula ψ of the form ↑φ _{1} → ⊡_{ τ } φ _{2} on the execution in Fig. 3. At time 0, φ _{2} holds and the observer’s register is set accordingly. The observer’s predicate holds, the algorithm returns true, and thus ⊡_{ τ } φ _{2} holds at time 0. By similar arguments, it also holds at time 1. At time 2, a ↓ transition of φ _{2} occurs and the register is updated. Since the predicate does not hold, ⊡_{ τ } φ _{2} does not hold at time 2. By similar arguments, it does not hold at time 3. Since a ↑ transition of φ _{2} occurs at time 4, the register is updated again. Again, the predicate does not hold, thus ⊡_{ τ } φ _{2} does not hold at time 4. The same is true for time 5. At time 6, ↑φ _{1} becomes true and since ⊡_{ τ } φ _{2} is true at that time, we deduce e ^{6}⊨ψ. For times n′ prior to 6 (i.e., 0≤n′<6), the left-hand side of the implication of ψ does not hold. We immediately have that e ^{ n′}⊨ψ.
Exists previously
Since exists previously of φ is equivalent to the negation of invariant previously of ¬φ, we can immediately derive an observer for the exists previously operator from the observer for ⊡_{ τ } φ. The resulting algorithm can straightforwardly be implemented by swapping the ↑ and ↓ transition checks in line 2 and line 5 (a ↑ transition of ¬φ is a ↓ transition of φ, and vice versa) and negating the output in line 8.
4.3 The invariant and exists within interval operators
We now present observers for the more general operators invariant within interval J (⊡_{ J }) and exists within interval J. Instead of a single register (as used by the observer for the invariant previously operator), both observers require a list of time point pairs. Clearly, an efficient implementation of this list is vital for an efficient observer. In the following, we present several techniques to keep the list succinct whilst preserving validity of the observer. For a list l, we denote by |l| its length, and by l[k], where \(k\in\mathbb{N}\), its kth element. We assume that elements are always appended to the tail of a list.
Invariant within interval (⊡_{ J } φ)
We will deduce the correctness of the observer stated in Algorithm 2 from the correctness of a generalized algorithm, presented in Sect. 4.4, obtaining:
Theorem 2
For all \(n\in\mathbb{N}_{0}\), the observer stated in Algorithm 2 implements e ^{ n }⊨⊡_{ J } φ.
Running example
Exists within interval
Since exists within interval J of φ is equivalent to the negation of ⊡_{ J } ¬φ, we can easily derive an observer for the exists within interval operator from the observer for ⊡_{ J } φ. As before, we obtain the observer by swapping ↑ and ↓ transitions and negating the output.
4.4 The since within interval operator
Theorem 3
For all \(n\in\mathbb{N}_{0}\), the observer in Algorithm 3 implements e ^{ n }⊨φ _{1} S _{ J } φ _{2}.
For the proof we introduce additional notation. For a list l, denote by l⋅T the list resulting from adding element T to the tail of list l. Further denote by l ^{ n }, where \(n\in\mathbb{N}_{0}\), the state of Algorithm 3’s list l _{ S } in line 19 executed at time n. By \(\overline{l}^{n}\) we denote the set \([0,n] \setminus \bigcup_{1\le k\le |l|} [\,l[k].\tau_{s},\; l[k].\tau_{e}+1)\). For example, if l ^{10}=((0,3),(5,8)), then \(\overline{l}^{10} = \{4,9,10\}\). We first show that the following proposition holds:
Proposition 1
Consider Algorithm 3 without the feasibility check in line 8, i.e., replace this line with “if true then”. For the modified algorithm the following is correct: For all \(n\in\mathbb{N}_{0}\) and i≤n, \(i\in\overline{l}^{n}\) holds iff both e ^{ i }⊨φ _{2} and for all k,i<k≤n, e ^{ k }⊨φ _{1}.
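The set \(\overline{l}^{n}\) used in Proposition 1 can be computed directly from its definition, which may help when checking the case analysis of the proof on small examples. The following is a minimal sketch; the function name and the encoding of ∞ as `math.inf` are assumptions.

```python
import math
from typing import Iterable, Set, Tuple


def complement(l: Iterable[Tuple[int, float]], n: int) -> Set[int]:
    """Return [0, n] without the half-open ranges [tau_s, tau_e + 1) of all tuples in l."""
    covered: Set[int] = set()
    for tau_s, tau_e in l:
        # An open tuple (tau_s, inf) covers everything from tau_s up to n.
        end = n if tau_e == math.inf else min(int(tau_e), n)
        covered.update(range(tau_s, end + 1))
    return set(range(n + 1)) - covered


if __name__ == "__main__":
    print(sorted(complement([(0, 3), (5, 8)], 10)))  # the example from the text: [4, 9, 10]
```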
Proof
The proof is by induction on \(n\in\mathbb{N}_{0}\).
Begin (n=0): Consider the four cases for φ _{1} and φ _{2}:
Case (i): Assume e ^{ n }⊨φ _{1} and \(e^{n} \not\models \varphi_{2}\). Then l ^{ n }=((0,∞)) and thus \(\overline{l}^{n} = \emptyset\). Since \(e^{n} \not\models\varphi_{2}\), the induction basis follows in this case.
Case (ii): Assume e ^{ n }⊨φ _{1} and e ^{ n }⊨φ _{2}. Then l ^{ n }=() and thus \(\overline{l}^{n} = \{0\}\). Since e ^{ n }⊨φ _{2}, the induction basis follows in this case.
Case (iii): Assume \(e^{n} \not\models\varphi_{1}\) and \(e^{n} \not \models \varphi_{2}\). The arguments are analogous to the arguments of case (i).
Case (iv): Assume \(e^{n} \not\models\varphi_{1}\) and e ^{ n }⊨φ _{2}. The arguments are analogous to the arguments of case (ii).
Step (n−1→n): Assume that the statement holds for n−1≥0. We will show that it holds for n, too. Thereby we consider the same cases (i) to (iv) as in the induction basis.
Case (i): We distinguish two cases for φ _{2}: a ↓ transition of φ _{2} (i.a) did, or (i.b) did not occur at time n.
In case of (i.a), l ^{ n }=l ^{ n−1}⋅(n,∞). Thus \(\overline{l}^{n} = \overline{l}^{n-1}\). Since e ^{ n }⊨φ _{1} but \(e^{n} \not\models\varphi_{2}\), the induction step follows in this case.
In case of (i.b), l ^{ n }=l ^{ n−1}. By the algorithm, the last element in l ^{ n } must be of the form (n′,∞) with n′<n. Thus \(\overline{l}^{n} = \overline{l}^{n-1}\). Again, the induction step follows in this case.
Case (ii): We distinguish two cases for φ _{2}: a ↑ transition of φ _{2} (ii.a) did, or (ii.b) did not occur at time n.
Now consider case (ii.a): If l ^{ n−1}=(), l ^{ n }=l ^{ n−1} holds, and thus \(\overline{l}^{n}=\overline{l}^{n-1}\cup\{n\}\). Otherwise, the last element in l ^{ n−1}, say (n′,∞), with n′<n, is replaced with (n′,n−1) in l ^{ n }. Again, \(\overline{l}^{n}=\overline{l}^{n-1}\cup\{n\}\). In both cases, the induction step follows, as e ^{ n }⊨φ _{1} and e ^{ n }⊨φ _{2}.
In case of (ii.b), l ^{ n }=l ^{ n−1}. By the algorithm, the last element in l ^{ n }, if it exists, must be of the form (n′,n″) with n′≤n″<n. Thus \(\overline{l}^{n} = \overline{l}^{n-1}\cup\{n\}\). Again, the induction step follows in this case.
Case (iii): By the algorithm, l ^{ n }=((0,∞)). Thus \(\overline{l}^{n} = \emptyset\). Since \(e^{n} \not\models \varphi_{2}\), the induction step follows in this case.
Case (iv): By the algorithm, and since n>0, l ^{ n }=((0,n−1)). Thus \(\overline{l}^{n} = \{n\}\). Since \(e^{n} \not\models\varphi_{1}\) and e ^{ n }⊨φ _{2}, the induction step follows in this case. □
We are now in the position to prove Theorem 3.
Proof of Theorem 3
We distinguish two cases for n, namely (i) n<min(J), and (ii) n≥min(J).
(i) In case n<min(J), interval [max(0,n−max(J)),n−min(J)] is empty, and e ^{ n }⊨φ _{1} S _{ J } φ _{2} is trivially false. Since the algorithm returns false in this case, the theorem follows for Algorithm 3 without the feasibility check for case (i).
(ii) In case n≥min(J), interval I=[max(0,n−max(J)),n−min(J)] is nonempty. Thus e ^{ n }⊨φ _{1} S _{ J } φ _{2} holds iff there exists an i∈I for which e ^{ i }⊨φ _{2} and for all k,i<k≤n, e ^{ k }⊨φ _{1}. From Proposition 1 we know that this is the case iff there exists an i∈I with \(i\in\overline{l}^{n}\). The latter is the case iff there exists no tuple (τ _{ s },τ _{ e }) in l ^{ n } with valid ^{⊡}((τ _{ s },τ _{ e }),n,J). Since, for n≥min(J), the algorithm returns true iff this is the case, the theorem follows for Algorithm 3 without the feasibility check for case (ii).
It remains to show that the theorem holds for Algorithm 3 with original line 8. If we can show that from ¬feasible((τ _{ s },τ _{ e }),n,J) follows ¬valid ^{⊡}((τ _{ s },τ _{ e }),n′,J), for all times n′≥n, we may safely remove tuple (τ _{ s },τ _{ e }) from the algorithm’s list without changing the algorithm’s return value.
Assume that valid ^{⊡}((τ _{ s },τ _{ e }),n′,J) holds, with n′≥n. We distinguish two cases for n′: (a) n′<max(J) and (b) n′≥max(J):
(a) In case n′<max(J), it follows from valid ^{⊡}((τ _{ s },τ _{ e }),n′,J) that T.τ _{ s }=0 and T.τ _{ e }≥n′−min(J)≥n−min(J). Thus feasible((τ _{ s },τ _{ e }),n,J) holds.
(b) Otherwise n′≥max(J), and it follows from valid ^{⊡}((τ _{ s },τ _{ e }),n′,J) that T.τ _{ s }≤n′−max(J) and T.τ _{ e }≥n′−min(J). Thus \(T.\tau_{e}-T.\tau_{s} \ge\operatorname{len}(J)\) and thereby feasible((τ _{ s },τ _{ e }),n,J).
The theorem follows. □
With the two definitions in (1), an observer algorithm implementing e ^{ n }⊨⊡_{ J } φ can be deduced from Algorithm 3 by negating its input, its output, and replacing the if condition in line 2 by true. Since the obtained algorithm is equivalent to Algorithm 2, Theorem 2 immediately follows.
4.5 Garbage collection
In the following, we show the correctness of our garbage collection strategy for any of the proposed algorithms: We first show that if a tuple T is allowed to be removed by the garbage collector at time n, it cannot satisfy valid ^{⊡} at that time or at any later time. It is thus safe to remove it from the list.
Lemma 1
If garbage(T,n,J), then ¬valid ^{⊡}(T,n′,J) for all n′≥n.
Proof
Assume that garbage(T,n,J) holds. Then T.τ _{ e }<n−min(J)≤n′−min(J). Since T.τ _{ e }≥n′−min(J) is necessary for valid ^{⊡}(T,n′,J) to hold, the lemma follows. □
We next show that always a prefix of a list is removed. This allows the garbage collector to evaluate garbage iteratively, starting from the head of the list.
For that purpose we introduce additional notation. We write “…” for a potentially empty sequence of tuples. For example, (…,T,T′,…) denotes a list of length at least two, where T and T′ are any two successive elements in this list.
Lemma 2
Let l=(…,T,T′,…) be the list of any of the proposed observer algorithms at time \(n\in\mathbb{N}_{0}\). If garbage(T′,n,J), then garbage(T,n,J).
Proof
Assume that garbage(T′,n,J) holds. Then T′.τ _{ e }<n−min(J). By observing that all of the proposed algorithms ensure that T.τ _{ e }≤T′.τ _{ e } for successive list elements T and T′, we obtain T.τ _{ e }<n−min(J), i.e., garbage(T,n,J) holds. The lemma follows. □
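Lemmas 1 and 2 together justify a garbage collector that only ever inspects the head of the list. The following sketch illustrates this; it uses the condition T.τ _{ e }<n−min(J) from the proofs above as the garbage predicate, while the function names and the use of a `deque` of (τ _{ s }, τ _{ e }) tuples are assumptions for illustration.

```python
import math
from collections import deque
from typing import Deque, Tuple

Pair = Tuple[int, float]  # (tau_s, tau_e); tau_e may be math.inf


def garbage(t: Pair, n: int, j_min: int) -> bool:
    """A tuple is garbage at time n if it ended before n - min(J)."""
    _, tau_e = t
    return tau_e < n - j_min


def collect(lst: Deque[Pair], n: int, j_min: int) -> int:
    """Remove the garbage prefix of lst in place and return the number of removed tuples."""
    removed = 0
    # By Lemma 2 it suffices to test the head: once the head is not
    # garbage, no later element is garbage either.
    while lst and garbage(lst[0], n, j_min):
        lst.popleft()
        removed += 1
    return removed


if __name__ == "__main__":
    l = deque([(0, 3), (5, 8), (10, math.inf)])
    print(collect(l, 12, 2), list(l))
```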
We next prove an upper bound on the length of Algorithm 2 or Algorithm 3’s lists. We start by showing that there is a minimum distance between successive elements in the algorithms’ lists.
Lemma 3
Let l=(…,T,T′,…) be the list of any of the proposed observer algorithms at time \(n\in\mathbb{N}_{0}\). Then T.τ _{ e }+2≤T′.τ _{ s }.
Proof
Consider Algorithm 2. By the algorithm, tuple T must have been added by line 8. For line 8 to add T=(T.τ _{ s },n−1), a transition of φ must have occurred at time n. Thus the next tuple added to the list at a time n′>n must have been of the form (n′,∞). Since, by the algorithm, T′.τ _{ s }≥n′ must then hold, we further obtain T′.τ _{ s }≥(n−1)+2=T.τ _{ e }+2. The lemma follows for Algorithm 2.
For Algorithm 3 the lemma follows by analogous arguments. □
Further the first element in the list that was not removed by the garbage collector cannot be of arbitrary age:
Lemma 4
Consider a time-bounded formula ⊡_{ J } φ, its exists within interval counterpart, or φ _{1} S _{ J } φ _{2}. Let l=(T,…) be the list of the proposed respective observer algorithm at time \(n\in\mathbb{N}_{0}\), after garbage collection has run at time n. Then T.τ _{ e }≥n−min(J).
Proof
It must hold that garbage(T,n,J) is false, since otherwise T would have been removed by the garbage collector. Thus T.τ _{ e }≥n−min(J). □
Lemma 5
Let l be the list of any of the proposed observer algorithms at time \(n\in\mathbb{N}_{0}\), after garbage collection has run at time n, and assume that l is nonempty. Let T ^{ k }=l[k], for 1≤k≤|l|. Then \(T^{k}.\tau_{e}\ge n-\min(J)+(k-1)(2+\operatorname{len}(J))\).
Proof
The proof is by induction on the number k≥1 of the element in the list.
Begin (k=1): Immediately follows from Lemma 4.
We may now derive an upper bound on the number of list elements for all our observer algorithms:
Theorem 4
Proof
In case l is empty the lemma follows trivially. Assume l=(T ^{1},…,T ^{ k }) is nonempty. We distinguish two cases for T ^{ k }:
4.6 Discussion of space and time complexity
An alternative to storing absolute times in the observer’s list is to adapt the observer algorithms such that only relative times are stored. While this potentially reduces the bound of Eq. (8) by substituting log_{2}(n) with log_{2}(max(J)), it requires updating the list elements (as these then contain relative times) at every time \(n\in\mathbb{N}_{0}\). Since this would require more complex hardware mechanisms and result in a slower online algorithm, we decided not to follow this path in our hardware implementation.
We next show that garbage collection allows one to reduce the time complexity of the proposed observers. The time-determining part of Algorithms 2 and 3 is the evaluation of the predicate valid ^{⊡} for all list elements in line 11 and line 19, respectively. However, garbage collection makes it possible to only evaluate the predicate for the first element in the list, thus greatly improving the time complexity of the proposed algorithms:
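To make the trade-off concrete, the following sketch compares the two variants numerically. It assumes that the bound referred to as Eq. (8) is the list bound stated in the contributions, \(2\lceil\log_{2}(n)\rceil\cdot(2\max(J)-\min(J)+2)/(2+\operatorname{len}(J))\), and that the relative variant simply replaces n by max(J); the function names and the sample values are hypothetical.

```python
from fractions import Fraction


def ceil_log2(x: int) -> int:
    """Exact ceil(log2(x)) for integers x >= 1."""
    if x < 1:
        raise ValueError("x must be at least 1")
    return (x - 1).bit_length()


def list_bits(time_bits: int, j_min: int, j_max: int) -> Fraction:
    """Bound on the list size in bits for time points of the given width."""
    length = j_max - j_min  # len(J) = max(J) - min(J)
    return Fraction(2 * time_bits * (2 * j_max - j_min + 2), 2 + length)


def bits_absolute(n: int, j_min: int, j_max: int) -> Fraction:
    return list_bits(ceil_log2(n), j_min, j_max)


def bits_relative(j_min: int, j_max: int) -> Fraction:
    return list_bits(ceil_log2(j_max), j_min, j_max)


if __name__ == "__main__":
    # Hypothetical values: run time n = 1000 ticks and J = [5, 10].
    print(float(bits_absolute(1000, 5, 10)), float(bits_relative(5, 10)))
```

The saving grows with the ratio of log_{2}(n) to log_{2}(max(J)), but, as noted above, it has to be weighed against the cost of updating every list element at each time step.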
Lemma 6
Let l=(T,…,T′,…) be the list of any of the observer algorithms at time \(n\in\mathbb{N}_{0}\), after garbage collection has run at time n. Then ¬valid ^{⊡}(T′,n,J).
Proof
Assume by means of contradiction that valid ^{⊡}(T′,n,J) holds. Then T′.τ _{ s }≤max(0,n−max(J))≤max(0,n−min(J)). For both Algorithms 2 and 3 we observe that T.τ _{ e }<T′.τ _{ s } has to hold. Thus T.τ _{ e }<max(0,n−min(J)). Since neither Algorithms 2 nor 3 add tuples with a negative τ _{ s } or τ _{ e } component, we obtain that T.τ _{ e }<n−min(J) has to hold and by that garbage(T,n,J) holds. A contradiction to the fact that garbage collection has been run at time n: it would have removed tuple T in that case. The lemma follows. □
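Lemma 6 means that, once the garbage prefix has been dropped, the verdict can be obtained from the head of the list alone. The sketch below illustrates this with the predicates as they appear in the proofs above (valid ^{⊡} as T.τ _{ s }≤max(0,n−max(J)) and T.τ _{ e }≥n−min(J), garbage as T.τ _{ e }<n−min(J)); the function names and the plain Python list are assumptions for illustration.

```python
from typing import List, Tuple

Pair = Tuple[int, float]  # (tau_s, tau_e)


def valid(t: Pair, n: int, j_min: int, j_max: int) -> bool:
    tau_s, tau_e = t
    return tau_s <= max(0, n - j_max) and tau_e >= n - j_min


def drop_garbage(lst: List[Pair], n: int, j_min: int) -> List[Pair]:
    """Return lst without its prefix of tuples that ended before n - min(J)."""
    k = 0
    while k < len(lst) and lst[k][1] < n - j_min:
        k += 1
    return lst[k:]


def some_tuple_valid(lst: List[Pair], n: int, j_min: int, j_max: int) -> bool:
    """Evaluate valid on the head only, after garbage collection."""
    live = drop_garbage(lst, n, j_min)
    return bool(live) and valid(live[0], n, j_min, j_max)
```

Evaluating a single tuple per time step is what removes the dependence of the evaluation time on the list length.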
5 Mapping the framework into hardware structures
5.1 Interfacing the system under test
Our runtime verification unit (see Fig. 5) connects to various systems through wiretapping of the SUT’s communication interfaces, as outlined in Fig. 1. The attachment to these communication interfaces is application specific. In its current form, the implementation provides bus interfaces for systems operating with RS-232 (serial port), CAN (vehicle bus), Wishbone (System-on-Chip interconnect), I²C (multi-master serial bus), and JTAG (boundary scan) variants.
5.2 Registers and lists of pairs of time points
Registers are implemented by, for example, linking multiple flip-flops. The width of such a register equals the width of the (upper bounded) time points issued by the RTC plus two additional bits. These additional bits enable indication of overflows when performing arithmetic on time points and indication of the special value ∞. For lists of pairs of time points, we turn to block RAMs, which we organize as ring buffers. Each ring buffer is managed by a unit that controls its read pointer (RP) and its write pointer (WP).
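The ring buffer organization can be pictured with the following software model. It is only a behavioral sketch of a buffer with a read and a write pointer, not a description of the actual hardware unit; the class and method names, the element counter, and the behavior on overflow are assumptions.

```python
from typing import List, Optional, Tuple

Pair = Tuple[int, float]  # (tau_s, tau_e); tau_e may be float("inf")


class RingBuffer:
    """Fixed-capacity list of time point pairs with read and write pointers."""

    def __init__(self, capacity: int) -> None:
        self.slots: List[Optional[Pair]] = [None] * capacity
        self.rp = 0     # read pointer: index of the head (oldest pair)
        self.wp = 0     # write pointer: index of the next free slot
        self.count = 0  # number of stored pairs

    def append(self, pair: Pair) -> None:
        """Store a pair at the tail; overflow handling is an assumption."""
        if self.count == len(self.slots):
            raise OverflowError("ring buffer full")
        self.slots[self.wp] = pair
        self.wp = (self.wp + 1) % len(self.slots)
        self.count += 1

    def head(self) -> Pair:
        if self.count == 0:
            raise IndexError("ring buffer empty")
        return self.slots[self.rp]

    def drop_head(self) -> None:
        """Advance the read pointer, e.g., when the head is garbage collected."""
        if self.count == 0:
            raise IndexError("ring buffer empty")
        self.rp = (self.rp + 1) % len(self.slots)
        self.count -= 1
```

Appending at the tail and removing only from the head matches the access pattern of the observers, which add tuples at the tail and garbage collect a prefix.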
5.3 Realtime clock
The progression of time is measured by a digital clock, i.e., the real-time clock (RTC), which contains a counter and an oscillation mechanism that periodically increments the counter [48, Chap. 3]. For an on-chip RVU solution, the oscillation mechanism can also be bound to the global system clock of the SUT. Note that the design also allows for an instantiation of a fully external clock which is decoupled from the SUT, such as a GPS receiver. Time points are internally stored in registers of width w=⌈log_{2}(N)⌉+2, where N is the maximum time (in terms of ticks of the RTC) expected to occur during a run of the SUT. The two additional bits enable indication of overflows when performing arithmetical operations on time points and indication of ∞.
Note that our proposed algorithms (cf. Sect. 4) make use of absolute time points, i.e., we store time points for both ↑ and ↓ transitions of an event e. In contrast, we could also use a mixed representation of absolute and relative time points, i.e., store the absolute time point of the ↑ transition of event e and then count the duration of e (the number of clock ticks until the ↓ transition occurs). While the latter would help to improve the average-case memory requirements in a software-oriented implementation, the former is superior in terms of a hardware implementation: In a hardware design, memory needs to be statically assigned at design time; thus registers have to be of width w, rendering the benefits of relative time points void. Furthermore, storing relative time points would require an additional counter of width w for all atomic propositions and subformulas that use time points.
5.4 Evaluation of atomic propositions
Ideally, with respect to expressiveness of the supported specifications, atomic propositions include arbitrary equalities, inequalities, and disequalities over variables in the state of the SUT. To arrive at a responsive framework, however, an observer needs to guarantee that it finishes evaluation of atomic propositions within a tight time bound. It is therefore necessary to establish a balance between (hardware) complexity of the resulting observer and expressiveness. To achieve this balance, we restrict the class of atomic propositions supported by our framework in a way inspired by the socalled logahedron abstract domain [45], frequently used in the field of abstract interpretation [24].
Specifically, the class of supported atomic propositions consists of conjunctions of linear constraints, where each constraint ranges over two variables. In addition, each variable can be negated and multiplied by a power of two. In our implementation, we support atomic propositions that are restricted linear constraints ranging over values transferred through an interface of the SUT. Specifically, atomic propositions are of the form (±2^{ n }⋅v _{1}±2^{ m }⋅v _{2})⋈c, where v _{1} and v _{2} are application-specific symbols, \(c,n,m \in\mathbb{Z}\) and ⋈∈{=,≠,≤,≥,>,<}. For example, when the RVU is connected to a microcontroller data bus (cf. Fig. 1), v _{1} (and v _{2}) can be interpreted as the value stored in a memory location, which, in turn, maps to a program variable.
Example
Consider the ptMTL formula φ= (↑(2⋅v _{1}+v _{2}≤68))→(⊡_{[5,10]}(4⋅v _{3}=20∨v _{4}=40)). Assume that the runtime verification framework is instantiated as shown in the top-right part of Fig. 1, i.e., it monitors a microcontroller core. The atomic propositions {σ _{1},σ _{2},σ _{3}} of φ are: σ _{1}≡(2⋅v _{1}+v _{2}≤68), σ _{2}≡(4⋅v _{3}=20), and σ _{3}≡(v _{4}=40). The symbols v _{1},…,v _{4} relate to memory locations stored in the microcontroller RAM. Together with debug information from the compiler, they can be linked to high-level language symbols, e.g., C code variables. Evaluating {σ _{1},σ _{2},σ _{3}} requires three AtChecker blocks. For example, to evaluate σ _{1}, an AtChecker is configured to load new data from the SUT interface as soon as new values for either v _{1} or v _{2} are transferred. Its shifter is programmed to shift v _{1} one position to the left, and the arithmetic unit is programmed to calculate the sum of 2⋅v _{1} and v _{2}. The comparator then compares this result with the constant 68 and finally outputs the truth value of σ _{1} at the current time point n.
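The shifter, arithmetic unit, and comparator of an AtChecker can be mimicked in software as follows. This is only a behavioral sketch under stated assumptions: the function names and parameters are hypothetical, a negative exponent is assumed to map to an arithmetic right shift, and a one-variable proposition such as σ _{2} is modeled by tying the second operand to 0.

```python
import operator

# Comparison operators allowed in an atomic proposition.
RELATIONS = {
    "=": operator.eq, "!=": operator.ne,
    "<=": operator.le, ">=": operator.ge,
    "<": operator.lt, ">": operator.gt,
}

def scale(value: int, exponent: int, negate: bool = False) -> int:
    """Multiply by 2**exponent with a shift; optionally negate the result."""
    # Assumption: a negative exponent is realized as an arithmetic right shift.
    shifted = value << exponent if exponent >= 0 else value >> -exponent
    return -shifted if negate else shifted

def at_checker(v1, v2, n, m, relation, c, neg1=False, neg2=False) -> bool:
    """Evaluate (+-2^n * v1 +- 2^m * v2) relation c."""
    total = scale(v1, n, neg1) + scale(v2, m, neg2)
    return RELATIONS[relation](total, c)

def sigma1(v1: int, v2: int) -> bool:
    """sigma_1 from the example: 2*v1 + v2 <= 68."""
    return at_checker(v1, v2, n=1, m=0, relation="<=", c=68)
```

With v _{1}=30 and v _{2}=8 the sum is exactly 68, so σ _{1} holds; increasing v _{1} to 31 yields 70 and σ _{1} no longer holds.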
5.5 Runtime observers
Evaluating the observer algorithms’ predicates
Subtraction and relational operators as required by the predicates feasible, garbage, and valid can be built around adders. Observe that, when Add(〈a〉,〈b〉,c) is a ripple-carry adder for arbitrary-length unsigned vectors 〈a〉 and 〈b〉 with carry-in c, the subtraction 〈a〉−〈b〉 is equivalent to \(\mathsf{Add} (\langle\mathsf{a}\rangle,\langle\overline{\mathsf{b}}\rangle, 1)\). Relational operators can be built around adders in a similar way [49, Chap. 6]. For example (left part of Fig. 7), valid ^{⊡}((τ _{ e },τ _{ s }),n,J) is implemented using five w-bit adders: one for q:=n−min(J), one for r:=T.τ _{ e }≥q, one to calculate p:=n−max(J), and two to calculate t:=T.τ _{ s }≤max(p,0). Finally, the unit outputs the verdict t∧r, where t and r are calculated in parallel. To evaluate […], the unit uses three w-bit adders: one to determine q:=n−τ, one for p:=q>0, and a third to calculate either […] or […], depending on the truth value of p. Finally, the validity checker outputs the verdict r to the ptLTL evaluation unit. Note that, in the actual implementation, we do not explicitly calculate q:=n−min(J) through an adder. Instead, the design is configured with an absolute time point that signals the end of the startup phase, which equals max(J)+1. A dedicated signal is cleared at reset and asserted once n=max(J)+1, thereby replacing an adder with a more resource-friendly comparator circuit in the implementation of the valid ^{⊡}((τ _{ e },τ _{ s }),n,J) predicate.
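The adder-based subtraction can be reproduced bit by bit in software. The sketch below is a behavioral model, not the VHDL design; the function names and the explicit width parameter are assumptions made for illustration. It also shows one standard way of obtaining a relational operator from the same adder: for unsigned operands, the carry-out of a−b is 1 exactly when a≥b.

```python
def ripple_carry_add(a: int, b: int, carry_in: int, width: int):
    """Add two unsigned width-bit vectors bit by bit; return (sum, carry_out)."""
    result, carry = 0, carry_in
    for i in range(width):
        bit_a = (a >> i) & 1
        bit_b = (b >> i) & 1
        result |= (bit_a ^ bit_b ^ carry) << i                # full-adder sum bit
        carry = (bit_a & bit_b) | (carry & (bit_a ^ bit_b))   # full-adder carry
    return result, carry

def subtract(a: int, b: int, width: int):
    """a - b computed as Add(a, not b, 1); the result wraps modulo 2**width."""
    mask = (1 << width) - 1
    return ripple_carry_add(a, ~b & mask, 1, width)

def greater_equal(a: int, b: int, width: int) -> bool:
    """For unsigned operands, the carry-out of a - b is 1 exactly when a >= b."""
    _, carry_out = subtract(a, b, width)
    return carry_out == 1
```

For example, with 8-bit vectors, 92−10 yields 82 with carry-out 1, whereas 5−10 wraps around to 251 with carry-out 0.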
Lists and garbage collection
For a list \(l_{\boxdot_{J}\varphi}\) we turn to block RAMs (abundant on contemporary FPGAs), which are organized as ring buffers (right in Fig. 7). Each ring buffer has a read pointer (rp) and a write pointer (wp). To insert a time point pair that satisfies feasible((τ _{ s },n−1),n,J), wp is incremented to point to the next free element in the ring buffer. The GC then adjusts rp to indicate the latest element with regard to n and J that is recent enough. In a fresh cycle (indicated by a changed time point n), the GC loads (τ _{ s },τ _{ e }) using rp, which is incremented iff garbage((τ _{ s },τ _{ e }),n,J) holds.
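The interplay of rp, wp, and the garbage collector can be sketched as a small software model. The class and method names are hypothetical, and the garbage predicate used here (the end point is older than n−min(J)) is an assumption inferred from the simulation example in Sect. 6.1 rather than the definition given with the algorithms.

```python
class RingBufferList:
    """Fixed-capacity list of (tau_s, tau_e) pairs with read/write pointers."""

    def __init__(self, capacity: int):
        self.slots = [None] * capacity
        self.rp = 0      # read pointer: oldest element still kept
        self.wp = 0      # write pointer: next free slot
        self.count = 0   # number of stored pairs

    def insert(self, pair):
        """Store a pair at wp and advance wp to the next free element."""
        if self.count == len(self.slots):
            raise OverflowError("ring buffer full")
        self.slots[self.wp] = pair
        self.wp = (self.wp + 1) % len(self.slots)
        self.count += 1

    def collect(self, n, interval, is_garbage):
        """One GC step per fresh time point n: advance rp iff the head is garbage."""
        if self.count and is_garbage(self.slots[self.rp], n, interval):
            self.rp = (self.rp + 1) % len(self.slots)
            self.count -= 1

    def head(self):
        """Return the pair at rp, or None if the list is empty."""
        return self.slots[self.rp] if self.count else None


def garbage(pair, n, interval):
    # Assumption: a pair is garbage once its end point is older than n - min(J).
    _, tau_e = pair
    return tau_e < n - interval[0]
```

Because both pointers only ever move forward modulo the capacity, the structure maps directly onto a block RAM with two address counters.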
Control logic and modularity
The control logic as shown in Fig. 7 allows one to easily reconnect hardware observers according to the specification’s parse tree, which entails that the specification can be modified (within resource limitations) without resynthesizing the whole design, which could take tens of minutes for FPGA designs.
5.6 A microcomputer to evaluate ptMTL and ptLTL specifications
Workflow
A (GUIbased) observergeneration application on a host computer compiles a ptMTL specification φ into a triple 〈Π,C _{ a },C _{ m }〉, where C _{ a } is a configuration for the AtChecker, C _{ m } is a configuration for the pool of time bounded MTL operators and Π is a native program for the μ Spy.
(1) We use the ANTLR parser generator [61] to parse φ. This step yields an abstract syntax tree (AST) that represents the specification.
(2) After some preprocessing of the AST, we determine the m subformulas φ _{1},…,φ _{ m } of φ by using a postorder traversal.
(3) For each subformula φ _{ i }, 1≤i≤m:
- If φ _{ i } is an atomic proposition, instantiate an AtChecker block and add its configuration to C _{ a }.
- If φ _{ i } is a ptLTL formula, we use the approach shown in [68, 70] to generate a native instruction for the μ Spy and add the instruction to Π.
- If φ _{ i } is a ptMTL formula, we instantiate the corresponding observer hardware block, generate the hardware block’s configuration and a native instruction for the μ Spy. We add the configuration to C _{ m } and the instruction to Π.
After running steps (1)–(3) of the synthesis procedure, the resulting configuration 〈Π,C _{ a },C _{ m }〉 is transferred from the host computer to the hardware platform on which the μ Spy is instantiated, e.g., through a Universal Serial Bus (USB) connection to an FPGA. We note that the host computer is only required to generate such a configuration for the current specification; it is not required during monitoring.
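Steps (2) and (3) of the synthesis procedure can be sketched as a postorder walk that sorts subformulas into the three parts of the configuration. The node representation, the kind labels, and the tuple formats are hypothetical; in particular, Boolean connectives are lumped together with ptLTL operators here because both end up as μ Spy instructions, and parsing (step 1) is assumed to have happened already.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    kind: str                      # "atom", "ptltl", or "ptmtl"
    label: str                     # e.g. "rise", "or", "box", or an atom name
    children: list = field(default_factory=list)
    interval: tuple = None         # only used by ptMTL nodes

def postorder(node):
    """Yield subformulas so that operands precede their operator."""
    for child in node.children:
        yield from postorder(child)
    yield node

def compile_spec(root):
    """Split a specification into (program, atchecker_cfg, mtl_cfg)."""
    program, atchecker_cfg, mtl_cfg = [], [], []
    for index, node in enumerate(postorder(root)):
        if node.kind == "atom":
            atchecker_cfg.append((index, node.label))      # goes to C_a
        elif node.kind == "ptltl":
            program.append((index, node.label))            # instruction in Pi
        elif node.kind == "ptmtl":
            mtl_cfg.append((index, node.interval))         # goes to C_m
            program.append((index, node.label))            # instruction in Pi
        else:
            raise ValueError(f"unknown node kind: {node.kind}")
    return program, atchecker_cfg, mtl_cfg
```

Applied to the formula of the example in Sect. 5.4, such a walk would visit σ _{1}, ↑σ _{1}, σ _{2}, σ _{3}, σ _{2}∨σ _{3}, the interval operator, and finally the implication.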
Instruction set architecture
OpCode  Addr. Operand 1  Addr. Operand 2  Interval Addr.  List Addr. 

5 bit  2+8 bit  2+8 bit  8 bit  7 bit 
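The field widths above add up to a 40-bit instruction word. The following sketch splits and reassembles such a word; the field names and helper functions are hypothetical. How the 2-bit operand prefix is interpreted is not stated here, so the sketch only separates it from the 8-bit address without assigning it a meaning.

```python
FIELD_WIDTHS = (5, 10, 10, 8, 7)   # opcode, operand 1, operand 2, interval, list
FIELD_NAMES = ("opcode", "operand1", "operand2", "interval_addr", "list_addr")

def decode(word: str) -> dict:
    """Split a 40-bit instruction, written as a bit string, into its fields."""
    bits = word.replace(" ", "")
    if len(bits) != sum(FIELD_WIDTHS):
        raise ValueError("expected a 40-bit instruction word")
    fields, pos = {}, 0
    for name, width in zip(FIELD_NAMES, FIELD_WIDTHS):
        fields[name] = int(bits[pos:pos + width], 2)
        pos += width
    return fields

def split_operand(operand: int):
    """Separate the 2-bit prefix from the 8-bit address of an operand field."""
    return operand >> 8, operand & 0xFF

def encode(fields: dict) -> str:
    """Inverse of decode; returns the space-separated bit string."""
    return " ".join(
        format(fields[name], f"0{width}b")
        for name, width in zip(FIELD_NAMES, FIELD_WIDTHS)
    )
```

Decoding the word `00110 1000000000 1000000001 00000000 0000000` from the listings in Sect. 6.1, for instance, gives opcode 6 and two operands with prefix 2 and addresses 0 and 1.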
Architectural features
The μ Spy manages two memories p[0,…,m−1] and q[0,…,m−1], which contain the evaluations of all m subformulas of φ (generated in a postorder traversal of the parse tree of φ; see step (2) of the synthesis procedure) in the previous and in the current execution cycle (i.e., time points n−1 and n), respectively. This allows for space- and time-efficient evaluation of formulas whose parse tree is a directed acyclic graph, and not necessarily a tree. For example, to evaluate the formula φ≡ (↑σ _{1})≡ σ _{1}∧¬⊙σ _{1}, one is not required to evaluate both σ _{1} and ⊙σ _{1} independently, and thus σ _{1} twice. Rather, we have two registers of length 1, i.e., p[0] holds the result of σ _{1} from the previous round and q[0] from the current round. The μ Spy then fetches both p[0] and q[0] and executes the instruction that represents the operator ↑, which maps to the Boolean operation (q[0]⊕p[0])∧q[0], i.e., σ _{1} toggled its truth value (q[0]⊕p[0] holds) and σ _{1} is true in the current state (q[0] holds). Each instruction is processed through a four-stage pipeline (fetch, load, calc, and write back). All stages except the calc stage require one clock cycle per instruction; the execution time of the calc stage depends on the operator and ranges from one to four clock cycles.
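The role of the two memories can be illustrated with a short software model of the ↑ operator. The class name, the memory layout (slot 0 for σ _{1}, slot 1 for ↑σ _{1}), the input trace, and the choice of initializing both memories to false are assumptions made for this sketch.

```python
def rising_edge(previous: bool, current: bool) -> bool:
    """(q XOR p) AND q: the value toggled and is true in the current cycle."""
    return (current ^ previous) and current

class EvaluationMemory:
    """Verdicts of m subformulas for the previous (p) and current (q) cycle."""

    def __init__(self, m: int):
        self.p = [False] * m
        self.q = [False] * m

    def next_cycle(self):
        """At a new time point, the current verdicts become the previous ones."""
        self.p = list(self.q)

mem = EvaluationMemory(2)
trace = [False, True, True, False, True]    # hypothetical sigma_1 at n = 0..4
edges = []
for value in trace:
    mem.next_cycle()
    mem.q[0] = value                             # sigma_1 from its AtChecker
    mem.q[1] = rising_edge(mem.p[0], mem.q[0])   # instruction for the edge operator
    edges.append(mem.q[1])
```

Only the time points at which σ _{1} changes from false to true (n=1 and n=4 in this trace) produce a true verdict for ↑σ _{1}, and σ _{1} itself is evaluated once per cycle.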
Execution time per operator
μ Spy clock cycles for Boolean, ptLTL, and ptMTL operators
Logic  Operator  μ Spy clock cycles
Boolean  ¬φ  1
Boolean  φ _{0}•φ _{1}, •∈{∧,∨}  1
ptLTL  ⊙φ  1
ptLTL  φ _{1} S φ _{2}  1
ptMTL  […]  2
ptMTL  […]  4
ptMTL  φ _{1} S _{ J } φ _{2}  4
Example
Consider the ptMTL property φ≡ (↑(2⋅v _{1}+v _{2}≤68))→(⊡_{[5,10]}(4⋅v _{3}=20∨v _{4}=40)). As in the example of Sect. 5.4, the atomic propositions {σ _{1},σ _{2},σ _{3}} of φ are evaluated by three AtChecker units. The subformulas ↑(2⋅v _{1}+v _{2}≤68) and (4⋅v _{3}=20 ∨ v _{4}=40) are checked by the μ Spy. For example, the current value of σ _{1} and the result of σ _{1} from time n−1 are used by the calc stage, which decides whether ↑σ _{1} holds at the current time. The truth value of σ _{2}∨σ _{3} is determined similarly, and its result is used as input to calculate ⊡_{[5,10]} (σ _{2}∨σ _{3}). The observer block is configured through the interval memory so as to represent J=[5,10]. The output of the ⊡_{[5,10]} (σ _{2}∨σ _{3}) calculation is then the input to the final ptLTL computation, i.e., φ≡ (↑σ _{1})→(⊡_{[5,10]}(σ _{2}∨σ _{3})).
6 Evaluation
To demonstrate the feasibility of our approach, we implemented the presented algorithms for ptMTL monitoring by means of the μ Spy on an FPGA platform. In the current implementation, subformulas are evaluated sequentially as they appear in the specification’s parse tree. Since the observer blocks are executed in sequence, their logic elements can be reused, and it suffices to equip the μ Spy with only one […], one ⊡_{ J } φ, and one φ _{1} S _{ J } φ _{2} hardware observer block and to assign memory according to the number of subformulas.^{2} The implementation is a synchronous register-transfer-level VHDL design, which we both simulated in Mentor Graphics ModelSim and synthesized for various FPGAs using the industrial logic synthesis tool Altera Quartus II.^{3}
6.1 Simulation results
We conducted several simulation runs of the VHDL implementation of the μ Spy unit when monitoring different ptMTL formulas with randomly generated inputs, representing the execution traces of an SUT. The simulation runs cover several combinations of the ptLTL operators ↑, ⊙, and φ _{1} S _{ s } φ _{2} as well as the time-bounded ptMTL operators […], […], and φ _{1} S _{ J } φ _{2}. The truth values of the involved atomic propositions {σ _{0},σ _{1},σ _{2}} were generated by placing 1000 truth value transitions with uniformly distributed interarrival times on the discrete timeline. In all simulated executions, our implementation behaved as specified. To increase confidence in the implementation, we used an automatic test suite, which checks the generated executions not only with the μ Spy, but also with (i) a software implementation of our observer algorithms and (ii) a naive offline monitoring algorithm following the semantics definition of ptLTL and ptMTL. We ran this setup with a set of sample specifications, compared the output of the three implementations, and iteratively fixed remaining bugs. We used traditional line coverage metrics to assess the test progress. A rigorous, formal correctness analysis of the μ Spy implementation, however, is still an open issue.
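A naive offline reference monitor of the kind mentioned in (ii) can be written almost verbatim from a semantics definition. The sketch below assumes the usual discrete-time reading of the time-bounded Since operator (φ _{1} S _{ J } φ _{2} holds at n iff φ _{2} held at some i≤n with n−i∈J and φ _{1} held at all time points after i up to n); it is not taken from the test suite itself, and the trace generator with its parameters is likewise hypothetical.

```python
import random

def since_naive(phi1, phi2, n, interval):
    """Check phi1 S_J phi2 at time n directly from the assumed semantics."""
    lo, hi = interval
    for i in range(n + 1):
        in_window = lo <= n - i <= hi
        if in_window and phi2[i] and all(phi1[j] for j in range(i + 1, n + 1)):
            return True
    return False

def random_trace(transitions, max_gap, seed=0):
    """Boolean trace with the given number of transitions and uniform gaps."""
    rng = random.Random(seed)
    trace, value = [], False
    for _ in range(transitions):
        # Hold the current value for a uniformly drawn number of ticks.
        trace.extend([value] * rng.randint(1, max_gap))
        value = not value
    trace.append(value)
    return trace
```

Such a monitor rescans the whole window at every time point, which is far too slow for online use but convenient as an oracle when comparing verdicts against an optimized observer.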
Simulation signals and their meaning; AH = Active High (issued when high); AL = Active Low (issued when low), and RTC = Real Time Clock
Signal Name  Unit  Meaning
s_clk  RVU  system clock of the RV framework
s_reset_n  RVU  asynchronous reset of the RV framework (AL)
s_sut_clk  RVU  system clock of the SUT
s_rtc_timestamp  RTC  counter value of the real-time clock (i.e., time point n)
s_atomic(0)  SUT  truth value of atomic proposition # 0, σ _{0} (AH)
s_atomic(1)  SUT  truth value of atomic proposition # 1, σ _{1} (AH)
s_atomic(2)  SUT  truth value of atomic proposition # 2, σ _{2} (AH)
s_atomic(3)  SUT  truth value of atomic proposition # 3, σ _{3} (AH)
s_violated  RVU  monitoring output e ^{ n }⊨φ (AH)
command  μ Spy  instruction (opcode) for the μ Spy
state  μ Spy  state of the fetch stage state machine
state  μ Spy  state of the load stage state machine
state  μ Spy  state of the calc stage state machine
state  μ Spy  state of the write back stage state machine
interval_min  μ Spy  min(J) (in RTC ticks)
interval_max  μ Spy  max(J) (in RTC ticks)
sel  List  select the list specified by buffer_nr (AH)
add_start  List  add start (rising transition) time point to the list (AH)
add_end  List  add end (falling transition) time point to the list (AH)
set_tail  List  clear list and add new entry (AH)
reset_tail  List  clear list and add entry with time point 0 (AH)
drop_tail  List  remove tail element from the list (AH)
delete  List  remove head element from the list (AH)
buffer_nr  List  id of the currently used list (AH)
(a) Invariant previously
01011 0000000000 0000000000 00000000 0000000 // rising edge at a(0)
10001 0000000001 0000000000 00000001 0000000 // [[]] a(1), i(1), mem 0
00110 1000000000 1000000001 00000000 0000000 // m(0) > m(1)
11111 1000000010 0000000000 00000000 0000000 // output result m(2)
and into the following data for the interval memory:
0000000000000000 0000000000000110 // startup phase duration: 6
0000000000000000 0000000000000101 // [0, 5]
The binary program consists of three subformulas and a dedicated end instruction. The interval memory holds two entries: the first denotes the duration of the startup phase in RTC clock cycles, and the second holds τ=5 for the invariant-previously operator. The startup phase signal is then used to implement the check whether n−τ≥0 in the […] predicate.
(b) Since within interval φ _{1} S _{ J } φ _{2}
01011 0000000000 0000000000 00000000 0000000 // rising edge at a(0)
10011 0000000001 0000000010 00000001 0000000 // a(1) S a(2), i(1), mem 0
00110 1000000000 1000000001 00000000 0000000 // m(0) > m(1)
11111 1000000010 0000000000 00000000 0000000 // output result m(2)
and into the following data for the interval memory:
0000000000000000 0000000000001011 // startup phase duration: 11
0000000000000101 0000000000001010 // [5, 10]
The instruction memory contains three instructions corresponding to the three operators in the formula. Figure 9b shows a snippet of the corresponding simulation trace. At time n=69, a falling transition of s_atomic(2) is detected and, according to Algorithm 3, n−1=68 is added to the list l _{ S } of the S observer, which is triggered by the add_end signal. At time n=74, the predicate garbage evaluates to true (since (68<74−min(5,10)) holds) and triggers the deletion of the element in the list: the signal delete is asserted. The rising transition of s_atomic(2) at time n=82 triggers the adding of the interval-start time point to l _{ S } (see Algorithm 3, line 4). Consequently, (82,∞) is the new head element of l _{ S }. From time n=84 on, s_atomic(1) and s_atomic(2) are false, which, according to Algorithm 3, sets the list to (0,∞). This is done through the reset_tail signal. At time n=92, we see a rising transition of s_atomic(0), which yields e ^{92}⊨ (↑σ _{0}). The valid ^{⊡} predicate evaluates as follows: (0≤92−max(5,10))∧(∞≥92−min(5,10)), yielding true. Finally, we obtain \(e^{92} \not\models\varphi_{2}\) and the violated signal is asserted.
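The arithmetic of this trace can be replayed with two small predicates. The valid check follows the description of valid ^{⊡} in Sect. 5.5 (t∧r with t:=τ _{ s }≤max(n−max(J),0) and r:=τ _{ e }≥n−min(J)); the garbage check is an assumed reading of the comparison quoted above, and the start point 0 used for the deleted pair is a placeholder, since the trace only reports its end point.

```python
INF = float("inf")   # stands in for the dedicated "infinity" encoding

def garbage(pair, n, interval):
    """Assumed reading of the example: the end point is older than n - min(J)."""
    _, tau_e = pair
    return tau_e < n - min(interval)

def valid_box(pair, n, interval):
    """valid for the interval operator: t AND r as described in Sect. 5.5."""
    tau_s, tau_e = pair
    t = tau_s <= max(n - max(interval), 0)
    r = tau_e >= n - min(interval)
    return t and r
```

With J=[5,10], the pair ending at 68 is still kept at n=73 (68<68 fails) and becomes garbage at n=74 (68<69), and valid_box((0, INF), 92, (5, 10)) evaluates to true, matching the values in the trace.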
6.2 Performance study
Recall that our hardware implementation uses one hardware module for […] and […] observers, one for the ⊡_{ J } φ and […] observers, and one for φ _{1} S _{ J } φ _{2} observers. The latter two modules both require lists of the same size and therefore scale identically with respect to operating frequency, logic elements, and required memory size. We thus treated them equally within the performance study.
Scalability
7 Related work
This section surveys related work by focusing on frameworks and tools, theoretical results on observer algorithms, and approaches that perform runtime verification either in or of hardware designs.
Frameworks and tools
Watterson and Heffernan [80] review established and emerging approaches for monitoring (software) executions of embedded systems; calling for future work on runtime verification approaches that utilize existing chip interfaces to provide the observations as events to an external monitoring system. Pike et al. [64] worked on runtime verification for realtime systems by defining observers in a dataflow language, which are compiled into programs with constant runtime and memory. If the original system is periodically schedulable with some safety margin, the monitored system can be shown to be schedulable, too. This approach targets software only, whereas we monitor a combination of embedded software and hardware components. Hardware observers that simply probe one or more internal signals have been known in literature for a few decades. An early instance thereof is the noninterference monitoring and replay mechanism by Tsai et al. [79]. Their monitoring system is based on the MC6800 processor that records the execution history of the target system. A dedicated replay controller then replays stored executions, which supports test engineers in lowlevel debugging. Although we share a similar idea of probing internal signals, our framework detects specification violations onthefly, rather than replaying traces from some execution history.
The Dynamic Implementation Verification Architecture (DIVA) exploits runtime verification at intraprocessor level [5]. Whenever a DIVAbased microprocessor executes an instruction, the operands and the results are sent to a checker which verifies correctness of the computation; the checker also supports fixing an erroneous operation. Chenard [19] presents a systemlevel approach to debugging based on insilicon hardware checkers. The work of Brörkens and Möller [18] is akin to ours in the sense that they also do not rely on code instrumentation to generate event sequences. Their framework, however, targets Java and connects to the bytecode using the Java Debug Interface (JDI) so as to generate sequences of events.
BusMOP [62] generates observers for ptLTL on FPGAs, which are connected to the Peripheral Component Interconnect (PCI). The commercial Temporal Rover system [29] implements observers for MTL formulas, but the implementation and algorithms used are not published.
Observer algorithms
We restrict our survey to observer algorithms for past-time logics in the discrete-time setting.
Thati and Roşu [78] presented an online observer for MTL formulas ψ. Their idea is to reduce the problem of deciding whether e ^{ n }⊨ψ to deciding several instances of e ^{ n′}⊨ψ′, where ψ′ is a subformula of ψ and n′≤n. Thereby, for each subformula φ _{1} S _{[a,b]} φ _{2} of ψ, the formulas φ _{1} S _{[a−1,b−1]} φ _{2}, φ _{1} S _{[a−2,b−2]} φ _{2}, …, φ _{1} S _{[0,b−a]} φ _{2}, …, φ _{1} S _{[0,0]} φ _{2} are defined to be subformulas of ψ. For example, in case ψ≡φ _{1} S _{[1,3]} φ _{2}, where φ _{1} and φ _{2} are atomic propositions, the reduced formulas of ψ are φ _{1}, φ _{2} as well as φ _{1} S _{[0,2]} φ _{2}, φ _{1} S _{[0,1]} φ _{2}, and φ _{1} S _{[0,0]} φ _{2}. Denoting by m the number of subformulas an MTL formula ψ is reduced to, the space complexity of their observer is within \(\mathcal{O}(m 2^{m})\) and its time complexity is within \(\mathcal{O}(m^{3} 2^{3m})\) for each time n in \(\mathbb{N}_{0}\) at which the observer is executed. For the special case of ψ≡φ _{1} S _{ J } φ _{2}, the observer still requires a memory of at least 2m≥2max(J) bit. While this bound is incomparable in general to our bound, for large values of max(J) we immediately obtain that our solution has lower memory complexity. For example, for φ _{1} S _{[5,1500]} φ _{2} the solution in [78] requires at least 3000 bit of memory, whereas our observer requires 208 bit, assuming (upper bounded) time points of 52 bit.
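The two figures in this example can be checked with a few lines of arithmetic. The function names are hypothetical; the count of four stored time points is not stated explicitly above but follows from dividing the quoted 208 bit by the 52-bit time point width.

```python
def thati_rosu_min_bits(interval):
    """Lower bound of 2 * max(J) bits quoted for the observer in [78]."""
    return 2 * max(interval)

def list_bits(time_points, width):
    """Memory for a list that stores the given number of time points."""
    return time_points * width
```

For J=[5,1500], thati_rosu_min_bits((5, 1500)) gives 3000, while list_bits(4, 52) gives 208.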
Maler et al. [57] presented an online observer algorithm for φ _{1} S _{ J } φ _{2} that is based on having active counters for each event of φ _{2}. Divakaran et al. [26] improved the number of counters of bit width log max(J) to \(2\lceil\min(J)/\operatorname{len}(J)\rceil+ 2\) and proved that any Since observer realized as a timed transition system must use at least \(2\lceil\min(J)/\operatorname{len}(J)\rceil+ 1\) clocks. While their space complexity is incomparable to ours in general, their solution is very resource intensive for a hardware realization: while we may store list values in cheap RAM blocks, their solution requires storing the current counter values in registers, since their values are incremented at every time step. Further, one can show by simple algebraic manipulations that:
Proposition 2
Proposition 3
It follows immediately from Proposition 2 that our observer requires at most two tuples in addition to the (counter) tuples required by Divakaran et al.’s observer. On the other hand, it follows from Proposition 3 that there exists a choice of parameters where our observer requires significantly less memory.
In contrast to the solution presented by Divakaran et al. [26], our solution is tailored to a discrete time base, dictated by our application domain: not only is a (discrete) system clock naturally available at the hardware level, but adding and comparing fractions would also incur a significant overhead with respect to latency and circuit size. Nonetheless, our algorithms also work in the dense time domain with only two small modifications: (i) instead of running the algorithms at every time \(n\in\mathbb{N}_{0}\), they need to be executed at every transition of an input signal, and (ii) the term “n−1” must be replaced by “n” in Algorithms 2 and 3. By analogous proofs we obtain that, in this case, list ℓ is of size at most \((\max(J))/(\operatorname{len}(J))+1\) tuples, which is at most one more than the number of clocks required by the Since observer by Divakaran et al. [26].
Basin et al. [11] present a (discrete time) pointbased observer for formula φ _{1} S _{ J } φ _{2} which runs in time \(\mathcal{O}(\log\max(J\cup\{n\}))\) if executed at time \(n\in \mathbb{N}_{0}\). Their algorithm, however, requires memory in the order of max(J). They further presented an intervalbased observer algorithm for φ _{1} S _{ J } φ _{2} with space complexity comparable to our solution. However, the algorithm is clearly motivated with a software implementation in mind, whereas we aim at efficient (highly parallel) circuit implementations. For example, for an arbitrary ptMTL formula φ, our timecomplexity bounds scale with the depth of the parse tree of φ, in case the μ Spy executes observer algorithms in parallel, and with the number of nodes in the parse tree of φ, in case the μ Spy executes observer algorithms sequentially. By contrast, the bounds in [11] scale with the fourth power of the number of nodes in the parse tree of φ. Further, a direct implementation of their algorithm would require considerable hardware overhead, as it makes use of doublylinked lists to store and manipulate time points. In comparison, our ring buffer design can easily be mapped to block RAM elements that are abundant on modern day FPGAs.
Hardware observers
In previous work, we have shown that ptLTL can, within certain bounds, be checked in hardware running at the same frequency as the SUT [68]. Assertion-based verification (ABV) [36] gained momentum in industrial-strength hardware verification, especially driven by the emergence of the Property Specification Language (PSL). PSL is based on LTL, augmented with regular expressions; thus, we will not compare our work to PSL monitoring algorithms but rather to the hardware architecture of the resulting checkers. Existing work largely aims at synthesizing hardwired circuits out of various temporal specifications, whereas our approach (a) focuses on ptMTL specifications and (b) aims at providing a reconfigurable framework that also has applications in testing and not only as a hard-coded observer. Translations from PSL into hardware follow either the modular or the automata-based synthesis approach.
In the modular approach [14, 15, 25, 27, 60], subcircuits for each operator are built and interconnected according to the parse tree of the PSL expression being monitored. These circuits then output a pair of signals indicating the status of the assertion. Boulé and Zilic [15] present a hardware-checker generator capable of supporting ABV by translating PSL to hardware language descriptions that can be included into the source design. The input to their circuit generator is the source file of the design under test (DUT). This limits their approach to designs where the source is available, whereas our framework can be attached to a variety of targets (cf. Fig. 1), even third-party proprietary systems. Unfortunately, their algorithms lack a complexity analysis. Borrione et al. [14] describe a method of translating properties of the PSL foundation layer into predefined primitive components. A component is a hardware unit consisting of a checking window and an evaluation block. They make use of shift register chains in the checking window block to trigger the execution of the evaluation block. Primitive components representing a timed operator (e.g., within the next τ time units) need to individually count the number of elapsed time points. Das et al. [25] presented a modular approach by decomposing System Verilog Assertions (SVA) into simple communicating parallel hardware units that, when connected together, act as an observer for an SVA. MorinAllory and Borrione [60] describe a generation of synthesizable hardware from regular expressions included in PSL. Drechsler [27] describes an approach to synthesize checkers for online verification of SoC designs through chains of shift registers, but does not allow for checking arithmetic relations among bitvectors. For hardware designs, these specifications are often directly available from the specification [75].
In the automata-based approach [4, 16, 17, 37, 38, 56], state machines are synthesized that check a property during simulation. The generated automata are generally of nondeterministic nature. To avoid a blowup of the automaton capable of monitoring formulas that are required to hold for a certain number of clock cycles, additional counters are inserted. However, this is only feasible if the output language natively supports nondeterministic finite automata (NFA); unfortunately, major hardware description languages (e.g., Verilog and VHDL) do not. Consequently, observers need to be converted to a deterministic finite automaton (DFA) first, which, in the worst case, yields an exponential blowup of the resulting DFA in the size of the NFA [43]. These theoretical limitations were also reflected in the experiments of Straka et al. [76], where they report on an attempt to verify trivial properties of a simple counter: the resulting observers synthesized by FoCs [1] from a PSL specification require 120 logic slices, whereas the resources for the counter itself account for only 3 slices. These performance issues motivated them to turn to a self-made tool to design online checkers instead of using existing toolchains. Lu and Forin [56] present a compiler from PSL to Verilog, which translates a subset of PSL assertions (sPSL, a C-language binding for PSL [20]) about a software program (written in C in their approach) into hardware execution blocks for an extensible MIPS processor, thus allowing for transparent runtime verification without altering the program under investigation. The synthesized verification unit is generated by a property rewriting algorithm developed by Roşu and Havelund [72]. Atomic propositions are restricted to a single comparison operator only. For comparison, our approach supports more complex relations among memory values in the atomic propositions, thus yielding greater flexibility and expressiveness in the specification language. Armoni et al. [4] describe an automata-theoretic construction based on determinization for unrestricted temporal logic, i.e., ForSpec [3]. They showed how to obtain deterministic compilation targeting dynamic verification that is as close as possible to the nondeterministic compilation of temporal assertions.
8 Conclusion
We presented an online runtime verification framework to check a ptMTL formula on executions with discrete time domain. At the framework’s heart is an observer design for the timebounded Since operator and the special cases of exists/invariant previously and within interval. Correctness proofs of all presented algorithms have been given and bounds on their time and space complexity have been proven. The promising complexity results are mainly due to the integration of a garbage collection and a filtering strategy that automatically drop events that can neither validate nor invalidate the specification.
We further discussed a reconfigurable hardware realization of our observer algorithm that provides sufficient flexibility to allow for changes of the monitored specification without necessarily resynthesizing the hardware observer. Reconfigurability is indeed a valuable property of the presented approach since logic synthesis is itself a very timeconsuming task. To demonstrate the feasibility of our approach for practical applications, we implemented the algorithms on a Field Programmable Gate Array. The predictable and low resource requirements of the presented hardware solution together with its reconfigurability support the application in the diagnosis of embedded realtime systems during execution time.
Based on the framework presented in this article, we plan to investigate the following directions. The question of who guards the guardians [74] is legitimate with regard to the implementation of our runtime verification unit. Whereas we gave a formal correctness analysis for the algorithms themselves, doing so for the implementation remains an open issue. Additionally, we plan to extend our work to (bounded) future time MTL specifications.
Footnotes
1. In our framework, we thus assume time points to be from \(\mathbb{N}_{0}\).
2. In our experiments, we opted for a resource-efficient design of the μ Spy. A configuration of the μ Spy with multiple ptMTL hardware observers immediately makes an evaluation of several subformulas in parallel possible, but increases resource requirements.
3. Tools can be downloaded from http://www.mentor.com and http://www.altera.com.
Notes
Acknowledgements
The work of Thomas Reinbacher and Matthias Függer has been supported within the FITIT project CevTes managed by the Austrian Research Agency FFG under grant 825891 and (partially) supported by the Austrian Science Foundation (FWF) under project S11405 (RiSE). The work of Jörg Brauer has been, in part, supported by the DFG Cluster of Excellence on Ultrahigh Speed Information and Communication, German Research Foundation grant DFG EXC 89 and by the DFG research training group 1298 Algorithmic Synthesis of Reactive and DiscreteContinuous Systems. The authors want to thank Dejan Nickovic, Andreas Steininger, Kristin Y. Rozier, and Johann Schumann for helpful discussions. Additionally, the authors want to thank Andreas Hagmann, Johannes Geist, and Patrick Moosbrugger for their help with the hardware implementation and experiments.
References
 1. Abarbanel Y, Beer I, Gluhovsky L, Keidar S, Wolfsthal Y (2000) FoCs: automatic generation of simulation checkers from formal specifications. In: CAV. LNCS, vol 1855. Springer, Berlin, pp 538–542
 2. Alur R, Henzinger TA (1990) Realtime logics: complexity and expressiveness. In: LICS. IEEE, New York, pp 390–401
 3. Armoni R, Fix L, Flaisher A, Gerth R, Ginsburg B, Kanza T, Landver A, MadorHaim S, Singerman E, Tiemeyer A, Vardi MY, Zbar Y (2002) The Forspec temporal logic: a new temporal propertyspecification language. In: TACAS. Springer, Berlin, pp 196–211
 4. Armoni R, Korchemny D, Tiemeyer A, Vardi M, Zbar Y (2006) Deterministic dynamic monitors for lineartime assertions. In: Formal approaches to software testing and runtime verification. LNCS, vol 4262. Springer, Berlin, pp 163–177
 5. Austin TM (1999) DIVA: a reliable substrate for deep submicron microarchitecture design. In: MICRO. IEEE, New York, pp 196–207
 6. Baier C, Katoen JP (2008) Principles of model checking. MIT Press, Cambridge
 7. Bardin S, Herrmann P, Védrine F (2011) Refinementbased CFG reconstruction from unstructured programs. In: VMCAI. Springer, Berlin, pp 54–69
 8. Barre B, Klein M, SoucyBoivin M, Ollivier PA, Hallé S (2012) MapReduce for parallel trace validation of LTL properties. In: RV. LNCS. Springer, Berlin
 9. Barringer H, Falcone Y, Finkbeiner B, Havelund K, Lee I, Pace GJ, Rosu G, Sokolsky O, Tillmann N (eds) (2010) Runtime verification—first international conference, proceedings. LNCS, vol 6418. Springer, Berlin
 10. Bartocci E, Grosu R, Karmarkar A, Smolka S, Stoller S, Zadok E, Seyster J (2012) Adaptive runtime verification. In: RV. LNCS. Springer, Berlin
 11. Basin D, Klaedtke F, Zălinescu E (2011) Algorithms for monitoring realtime properties. In: RV. LNCS, vol 7186. Springer, Berlin, pp 260–275
 12. Bate I, Conmy P, Kelly T, McDermid J (2001) Use of modern processors in safetycritical applications. Comput J 44(6):531–543
 13. Bauer A, Leucker M, Schallhart C (2010) Comparing LTL semantics for runtime verification. J Log Comput 20(3):651–674
 14. Borrione D, Liu M, MorinAllory K, Ostier P, Fesquet L (2005) Online assertionbased verification with proven correct monitors. In: ICICT, pp 125–143
 15. Boulé M, Zilic Z (2005) Incorporating efficient assertion checkers into hardware emulation. In: ICCD. IEEE Computer Society Press, Los Alamitos, pp 221–228
 16. Boulé M, Zilic Z (2006) Efficient automatabased assertionchecker synthesis of PSL properties. In: Highlevel design validation and test workshop. Eleventh annual IEEE international, pp 69–76
 17. Boulé M, Zilic Z (2008) Automatabased assertionchecker synthesis of PSL properties. ACM Trans Des Autom Electron Syst 13(1)
 18. Brörkens M, Möller M (2002) Dynamic event generation for runtime checking using the JDI. Electron Notes Theor Comput Sci 70(4):21–35
 19. Chenard JS (2011) Hardwarebased temporal logic checkers for the debugging of digital integrated circuits. PhD thesis, McGill University
 20. Cheung PH, Forin A (2007) A Clanguage binding for PSL. In: Proceedings of the 3rd international conference on embedded software and systems. ICESS ’07. Springer, Berlin, pp 584–591
 21. Clarke EM (2009) My 27year quest to overcome the state explosion problem. In: LICS. IEEE Computer Society Press, Los Alamitos, p 3
 22. Clarke EM, Grumberg O, Peled DA (1999) Model checking. MIT Press, Cambridge. ISBN 0262032708
 23. Colombo C, Pace GJ, Schneider G (2009) Safe runtime verification of realtime properties. In: FORMATS. LNCS, vol 5813. Springer, Berlin, pp 103–117
 24. Cousot P, Cousot R (1977) Abstract interpretation: a unified lattice model for static analysis of programs by construction or approximation of fixpoints. In: POPL
 25. Das S, Mohanty R, Dasgupta P, Chakrabarti P (2006) Synthesis of system verilog assertions. In: DATE, vol 2, pp 1–6
 26. Divakaran S, D’Souza D, Mohan MR (2010) Conflicttolerant realtime specifications in metric temporal logic. In: TIME, pp 35–42
 27. Drechsler R (2003) Synthesizing checkers for online verification of systemonchip designs. In: ISCAS, vol 4, pp IV748–IV751
 28. Druilhe A, Daumas F, Nguyen T (2010) Formal verification of an FPGA emulation of the motorola 6800 microprocessor. In: NPIC&HMIT. American Nuclear Society, New York, pp 1316–1325
 29. Drusinsky D (2003) Monitoring temporal rules combined with time series. In: CAV. LNCS, vol 2725. Springer, Berlin, pp 114–118
 30. Dvorak D (ed) (2009) NASA study on flight software complexity. NASA office of chief engineer
 31. Eide E, Regehr J (2008) Volatiles are miscompiled, and what to do about it. In: EMSOFT. ACM, New York, pp 255–264
 32. Emerson EA (1990) Temporal and modal logic. In: Handbook of theoretical computer science, vol B. MIT Press, Cambridge, pp 995–1072
 33. Engblom J (2001) On hardware and hardware models for embedded realtime systems. In: RTSS
 34. Fischmeister S, Lam P (2010) Timeaware instrumentation of embedded software. IEEE Trans Ind Inform 6(4):652–663
 35. Flexeder A, Mihaila B, Petter M, Seidl H (2010) Interprocedural control flow reconstruction. In: APLAS. LNCS, vol 6461. Springer, Berlin, pp 188–203
 36. Foster H, Lacey D, Krolnik A (2003) Assertionbased design, 2nd edn. Kluwer Academic, Norwell
 37. Gheorghita S, Grigore R (2005) Constructing checkers from PSL properties. In: CSCS’05 international conference on control systems and computer science, pp 757–762
 38. Gordon M, Hurd J, Slind K (2003) Executing the formal semantics of the accellera property specification language by mechanised theorem proving. In: CHARME 2003. LNCS, vol 2860. Springer, Berlin, pp 200–215
 39. Havelund K, Roşu G (2004) An overview of the runtime verification tool Java PathExplorer. Form Methods Syst Des 24(2):189–215
 40. Havelund K (2008) Runtime verification of C programs. In: TestCom/FATES. Springer, Berlin, pp 7–22
 41. Havelund K, Roşu G (2004) Efficient monitoring of safety properties. Int J Softw Tools Technol Transf 6:158–173
 42. Havelund K, Rosu G (2002) Synthesizing monitors for safety properties. In: TACAS. LNCS. Springer, Berlin, pp 342–356
 43. Hopcroft JE, Motwani R, Ullman JD (2006) Introduction to automata theory, languages, and computation. AddisonWesley Longman, Reading
 44. Horowitz P, Hill W (1980) The art of electronics. Cambridge University Press, Cambridge. ISBN 0521370957
 45. Howe J, King A (2009) Logahedra: a new weakly relational domain. In: ATVA. LNCS, vol 5799. Springer, Berlin, pp 306–320
 46. Kinder J, Veith H, Zuleger F (2009) An abstract interpretationbased framework for control flow reconstruction from binaries. In: VMCAI. LNCS, vol 5403. Springer, Berlin, pp 214–228
 47. Kogge PM, Stone HS (1973) A parallel algorithm for the efficient solution of a general class of recurrence equations. IEEE Trans Comput 22(8):786–793
 48. Kopetz H (2011) Realtime systems, 2nd edn. Springer, Berlin
 49. Kroening D, Strichman O (2008) Decision procedures: an algorithmic point of view. Springer, Berlin
 50. Laroussinie F, Markey N, Schnoebelen P (2002) Temporal logic with forgettable past. In: LICS. IEEE, New York, pp 383–392
 51. Lee I, Kannan S, Kim M, Sokolsky O, Viswanathan M (1999) Runtime assurance based on formal specifications. In: PDPTA, pp 279–287
 52. Leroy X (2006) Formal certification of a compiler backend or: programming a compiler with a proof assistant. In: POPL. ACM, New York, pp 42–54
 53. Leroy X (2009) A formally verified compiler backend. J Autom Reason 43:363–446
 54. Lichtenstein O, Pnueli A, Zuck L (1985) The glory of the past. In: Logics of programs. LNCS, vol 193. Springer, Berlin, pp 196–218
 55. Lindig C (2005) Random testing of C calling conventions. In: AADEBUG. ACM, New York, pp 3–12
 56. Lu H, Forin A (2007) The design and implementation of P2V, an architecture for zerooverhead online verification of software programs. Tech rep MSRTR200799, Microsoft Research
 57. Maler O, Nickovic D, Pnueli A (2005) Real time temporal logic: past, present, future. In: FORMATS, pp 2–16
 58. Manna Z, Pnueli A (1992) The temporal logic of reactive and concurrent systems. Springer, Berlin
 59. Marwedel P (2011) Embedded system design. Springer, Berlin. ISBN 9789400702578
 60. MorinAllory K, Borrione D (2006) Proven correct monitors from PSL specifications. In: DATE, pp 1–6
 61. Parr TJ, Quong RW (1995) ANTLR: a predicatedll(k) parser generator. Softw Pract Exp 25:789–810
 62. Pellizzoni R, Meredith P, Caccamo M, Rosu G (2008) Hardware runtime monitoring for dependable COTSbased realtime embedded systems. In: RTSS, pp 481–491
 63. Pellizzoni R, Meredith P, Caccamo M, Rosu G (2008) Hardware runtime monitoring for dependable COTSbased realtime embedded systems. In: RTSS, pp 481–491
 64. Pike L, Goodloe A, Morisset R, Niller S (2010) Copilot: a hard realtime runtime monitor. In: RV. LNCS, vol 6418. Springer, Berlin, pp 345–359
 65. Pike L, Niller S, Wegmann N (2011) Runtime verification for ultracritical systems. In: RV. LNCS, vol 7186. Springer, Berlin, pp 310–324
 66. Puschner P (2002) Is worstcase executiontime analysis a nonproblem? – towards new software and hardware architectures. In: Proceedings of the 2nd Euromicro international workshop on WCET analysis, Department of Computer Science, University of York
 67. Reinbacher T, Brauer J (2011) Precise control flow reconstruction using boolean logic. In: EMSOFT. ACM, New York, pp 117–126
 68. Reinbacher T, Brauer J, Horauer M, Steininger A, Kowalewski S (2011) Past time LTL runtime verification for microcontroller binary code. In: FMICS. LNCS, vol 6959. Springer, Berlin, pp 37–51
 69. Reinbacher T, Brauer J, Horauer M, Steininger A, Kowalewski S (2012) Runtime verification of microcontroller binary code. Sci Comput Program (in press)
 70. Reinbacher T, Brauer J, Schachinger D, Steininger A, Kowalewski S (2011) Automated testtrace inspection for microcontroller binary code. In: RV. LNCS, vol 7186. Springer, Berlin, pp 239–244
 71. Reinbacher T, Függer M, Brauer J (2013) Realtime runtime verification on chip. In: Qadeer S, Tasiran S (eds) RV. LNCS, vol 7687. Springer, Berlin, pp 110–125
 72. Roşu G, Havelund K (2005) Rewritingbased techniques for runtime verification. Autom Softw Eng 12(2):151–197
 73. RTCA/DO178B (1992) Software considerations in airborne systems and equipment certification
 74. Schumann J, Srivastava A, Mengshoel O (2010) Who guards the guardians? Toward V&V of health management software. In: RV. LNCS, vol 6418, pp 399–404
 75. Shimizu K, Dill DL, Hu AJ (2000) Monitorbased formal specification of PCI. In: FMCAD. Springer, Berlin, pp 335–353
 76. Straka M, Kotásek Z, Winter J (2008) The design of hardware checkers for verification and diagnostic purposes. In: CSE, pp 320–327
 77. Tabakov D, Rozier KY, Vardi MY (2012) Optimized temporal monitors for SystemC. Form Methods Syst Des 41(3):236–268
 78. Thati P, Roşu G (2005) Monitoring algorithms for metric temporal logic specifications. Electron Notes Theor Comput Sci 113:145–162
 79. Tsai JJP, Fang KY, Chen HY, Bi Y (1990) A noninterference monitoring and replay mechanism for realtime software testing and debugging. IEEE Trans Softw Eng 16:897–916
 80. Watterson C, Heffernan D (2007) Runtime verification and monitoring of embedded systems. IET Softw 1(5):172–179
 81. Yang X, Chen Y, Eide E, Regehr J (2011) Finding and understanding bugs in C compilers. In: PLDI. ACM, New York, pp 283–294
Copyright information
Open Access This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.