1 Tool Overview

R2U2 (Realizable, Responsive, Unobtrusive Unit) is a modular framework for hardware (FPGA) and software (C and C++) real-time runtime verification (RV). R2U2 runs online, during system execution, with minimal overhead. (It also runs offline, over simulated data streams or recorded data logs.) R2U2 is stream-based; given a runtime requirement \(\varphi \) and an input computation \(\pi \) of sensor and software values at each timestamp i, R2U2 returns the verdict (true or false) for all i as to whether \(\pi , i \models \varphi \). We call this output stream an execution sequence [34]; it is a stream of two-tuples \(\langle verdict, time \rangle \) for every time i. R2U2 encodes specifications as observers (a set of which we call a configuration) via an optimized algorithm with published proofs of correctness, time, and space [18, 20, 34].

Fig. 1. Workflow for verifying a specification using R2U2. Red shaded boxes denote runtime components and blue shaded boxes denote design-time components. Note that for validation, the runtime components can run offline, e.g., by replacing the data stream with a log file of simulated data. Users formalize their system requirements as MLTL formulas within a C2PO specification, use C2PO to generate an R2U2 configuration, then monitor the verdicts R2U2 outputs based on the configuration and data stream. (Color figure online)

Figure 1 depicts a standard R2U2 workflow. To integrate R2U2 into a target system, we first need a validated set of runtime requirements. Given the system’s resource constraints, the Configuration Compiler for Property Organization (C2PO) creates an optimized encoding of the input set of requirements as an R2U2 configuration. Users can swap configurations monitored by R2U2 at runtime, during system execution, based on system state, mission phase, or to upgrade the specification version – all without recompiling and redeploying the R2U2 engine, a key feature for systems that require onerous code change certifications, or e.g., systems that need to be launched into space and then dynamically updated as their hardware degrades.

R2U2 fills the unique gap in the RV community described by its name [39]:

  • Realizability: R2U2 analyzes generic, re-usable specifications in Mission-Time Linear Temporal Logic (MLTL) [20, 34], a variant of LTL with closed integer-bounded intervals on the temporal operators. MLTL excels at capturing requirements conceptualized as timelines, as is common in aerospace operational concepts, e.g., [1, 11, 45]. At its core, R2U2 specifications combine either a future-time or past-time MLTL formula with simple signal comparators [34]. New optional extensions provide additional features, such as simple set-level reasoning [5]. R2U2’s hardware implementation, written in VHDL, avoids overburdening limited computing resources by utilizing Field Programmable Gate Arrays (FPGAs) to monitor in parallel with the system under absolute timing guarantees. R2U2’s two software implementations avoid hardware integration and software instrumentation challenges at the cost of (minimal) compute resources on the host system and are designed to be suitable for different environments. The C version forgoes memory allocation and bounds checking to provide fast deterministic results for real-time controllers under stringent certifiability criteria; alternatively, the C++ version makes full use of dynamic memory, templates, and runtime checks for maximum flexibility without monitor tuning. Additionally, the implementations differ significantly in architecture to provide fault independence. The three monitor implementations enable on-board (embedded) and on-ground execution, integration with multiple human-machine interaction paradigms, cross-validation, or triple modular redundancy voting strategies to increase system trust.

  • Responsiveness: R2U2 provides two levels of responsiveness. At a system level, runtime reconfiguration of the monitor without a lengthy re-compilation (and re-certification) process keeps R2U2 responsive to the system’s needs even as the mission, platform, or requirements evolve. At a specification level, R2U2’s asynchronous (event-triggered) observers provably report both true and false verdicts (rather than only reporting property violations) in the first timestamp where there is sufficient information to evaluate \(\pi , i \, \models \, \varphi \), thus monitoring integrity, safety, and security requirements in real-time. Since the monitor’s response time is a function of the specification and known a priori, higher-level autonomous system health and decision-making controllers can rely on R2U2 verdicts to provide a tight bound on mitigation triggering or other reactive behaviors.

  • Unobtrusiveness: R2U2’s multi-architecture, multi-platform design enables effective runtime verification while respecting crucial unobtrusiveness properties of embedded systems, including functionality (no change in behavior), certifiability (bounded time and memory under safety cases), timing (no interference with timing guarantees), and tolerances (respect constraints on size, weight, power, bandwidth, and overhead). R2U2 obeys unobtrusiveness constraints, provably fitting into tight resource limits and operational constraints frequently encountered in space missions. It can operate without code instrumentation or insight into black-box subcomponents such as ITAR, restricted, or closed-source modules [29].

User Base. After an extensive survey of all currently-available verification tools, NASA’s Lunar Gateway Vehicle System Manager (VSM) team selected R2U2 for operational verification [8, 9, 10]; R2U2 is currently operating in the NASA core Flight System/core Flight Executive (cFS/cFE) [28] VSM environment. R2U2 is embedded in the space left over on the FPGA controlling the knee of NASA’s Robonaut2 to provide real-time fault disambiguation [18], interfacing via the Robot Operating System (ROS) [31]. R2U2 is running on a UAS Traffic Management (UTM) system [5], where it recently detected a flight-plan timing fault. JAXA is running R2U2 on a 2021 autonomous satellite mission with a requirement for a provable memory bound of 200KB [30]. R2U2 recently verified a CubeSat communications system [24], an open-source UAS [16], a sounding rocket [15], and a high-altitude balloon [23]. The CySat-I satellite uses R2U2 for autonomous fault recovery [2]. In the recent past, R2U2 was used in NASA’s Autonomy Operating System (AOS) for UAS [22] (where it flew on NASA’s S1000 octocopter [21]), the NASA Swift UAS [13, 34, 36, 43], and the NASA DragonEye UAS [41, 44]. R2U2 aided in NASA embedded system battery prognostics [42] and a case study on small satellites and landers [35]. R2U2 has also proven useful for monitoring and diagnosis of security threats on-board NASA UAS like the DragonEye [27, 40]. R2U2 was cataloged by the user community in a 2018 taxonomy of RV tools [12, 39], and appeared in a 2020 Institute of Information Security (ETH Zürich, Switzerland) case study [33]. R2U2 is open-source, dual-licensed under MIT and Apache-2.0.

Table 1. Overview of changes to the R2U2 specification syntax for a basic temperature limit requirement, where Temp is located at index 0 of the input signal vector. This is not an exhaustive comparison but covers directly equivalent features, while Fig. 2 and the remainder of Sect. 2 detail new capabilities.

2 Compiler and Specification Language

Specification is a notoriously difficult aspect of RV [37]; verification results are only meaningful if the input specifications are correct and complete with respect to the system requirements. An RV engine is only usable if system engineers can validate that it monitors its given requirements as they expect, so they can clearly explain when and why different RV verdicts occur. In consultation with outside groups using R2U2 on real systems [8, 14, 30], we developed a new specification language and an accompanying formula-set compiler. The language’s and compiler’s features make specifications easier to read and write, improving user productivity and easing validation to address the challenges of specification in RV.

2.1 New Specification Language

Previous versions of R2U2 used a specification language derived from the implementation of the hardware runtime engine. While sufficiently expressive for the creation of R2U2 configurations, it utilized a restricted syntax that supported only basic MLTL operators and single-operator expressions over non-Boolean data types. Writing specifications that are transparent and easy to validate could be difficult without in-depth knowledge of R2U2’s architecture [17].

The new SMV-inspired [26] specification language gives users the option to write specifications more naturally, with support for compound expressions over complex data types including sets and C-like structs, as well as sections for defining structs, variables, macros, and MLTL formulas. C2PO supports Boolean, struct, and parametric set types with configurable integer and floating point types. To run R2U2 in software, users select a C standard type for each of the integer and float types, e.g., an unsigned 16-bit integer (uint16_t) and double-precision floating point (double). If targeting hardware (FPGA implementation), users can configure integer and float types to a bit-width supported by the target system. Table 1 presents a comparison between the old [39] and new syntaxes and Fig. 2 presents a sample file for monitoring a request-handling system.

Fig. 2. Sample C2PO specification file using structs (lines 2–3, 12–13), sets (lines 3, 15–16), and set aggregation operators (lines 22–23). The specification on lines 19–20 captures the English requirement, “The active times for \(rq_0\) and \(rq_1\) shall differ by no more than 10.0 s,” and the specification on lines 22–26 captures the English requirement, “For each request r of each arbiter in ArbSet, r’s status shall be GRANT or REJECT within the next 5 s and until then shall be WAITING.”

To create an R2U2 configuration, C2PO generates an Abstract Syntax Tree (AST) representation of the input, performs type checking, applies optimizations and rewriting rules, then outputs the corresponding R2U2 configuration. R2U2 does not use automata to encode temporal logic observers (as reported erroneously elsewhere [12]); instead C2PO traverses the AST to produce assembly-like imperative evaluation instructions for the R2U2 monitor to execute at runtime.

In order to meet the demands of a wide range of systems, R2U2 Version 3.0 includes many optional features that are specific to one of the three implementations that can be enabled during system integration. For example, the Booleanizer module computes arbitrary non-Boolean expressions in the C implementation of R2U2, but this feature is not an option in the C++ or hardware implementations. C2PO allows users to enable or disable such features according to the capabilities of their target systems and chosen R2U2 implementation.

2.2 Assume-Guarantee Contract Support

Assume Guarantee Contracts (AGCs) provide a template for structuring and validating complex requirements in aerospace operational concepts [3]. AGCs feature a guard or trigger clause called the “assumption” and a system invariant called the “guarantee;” they have been used to structure both English and formal (e.g., temporal logic) requirements by projects including the NASA Lunar Gateway Vehicle System Manager [10]. R2U2 V3.0 now directly supports AGCs with an input syntax for expressing AGCs in C2PO and an output format for R2U2 that provides granular interpretation of verdicts, as presented in [17]. The input syntax for declaring an AGC is where the semantics for this logical implication provides three distinct cases: the AGC is “inactive” if the assumption is false, “true” if both the assumption and guarantee are true, and “false” otherwise. When the optional AGC feature is enabled, R2U2 produces three-valued verdicts to represent the state of the AGCs in a clear format; otherwise R2U2 interprets logical implications in the standard way (where \(\texttt {false} \rightarrow \texttt {true} \) results in the verdict \(\texttt {true} \) rather than inactive).
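The three AGC cases can be sketched in a few lines of Python. This is only an illustration of the verdict logic described above, not R2U2's actual output format; the names `AGCVerdict`, `agc_verdict`, and `implication` are hypothetical.

```python
from enum import Enum


class AGCVerdict(Enum):
    """Three-valued AGC verdict (hypothetical names for illustration)."""
    INACTIVE = "inactive"
    TRUE = "true"
    FALSE = "false"


def agc_verdict(assumption: bool, guarantee: bool) -> AGCVerdict:
    """AGC interpretation: a false assumption makes the contract inactive."""
    if not assumption:
        return AGCVerdict.INACTIVE
    return AGCVerdict.TRUE if guarantee else AGCVerdict.FALSE


def implication(assumption: bool, guarantee: bool) -> bool:
    """Standard two-valued implication, used when the AGC feature is disabled."""
    return (not assumption) or guarantee


# Compare the two interpretations over all four input combinations.
for a, g in [(False, False), (False, True), (True, True), (True, False)]:
    print(a, g, agc_verdict(a, g).value, implication(a, g))
```

The two interpretations differ only when the assumption is false: the AGC reports inactive, while plain implication reports true.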

2.3 Set Aggregation

A common pattern in real-world specifications applies an identical formula to various input signals, such as testing all temperature sensors for an overheat condition. A naive encoding of these specifications in MLTL can be excessively large to the point of obscuring intent while providing ample opportunity for copy-paste errors, typos, or incomplete updates to variables – all of which are difficult for humans to spot during validation. C2PO mitigates this issue by supporting set aggregation operators that compactly encode these expressions as sets of streams with a predicate applied to each element [14].

To illustrate, consider the specification in Fig. 2. The direct encoding of this specification without the “foreach” operator is

figure d

Contrast this with the more compact encoding using the “foreach” operator on lines \(22-26\) in Fig. 2. The latter retains the intent of the English-level requirement while being semantically equivalent to the direct encoding. This concise representation both eases validation by improving readability and reduces the potential for errors by avoiding replicated values that require simultaneous updates.
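As a rough sketch of what such an expansion involves, the Python snippet below unrolls a "foreach"-style aggregation into an explicit conjunction. The function `expand_foreach`, the sensor names, and the textual formula syntax are assumptions for illustration only; they are neither C2PO's implementation nor its concrete syntax.

```python
from typing import Callable, Iterable


def expand_foreach(elements: Iterable[str], predicate: Callable[[str], str]) -> str:
    """Unroll a 'for each element' aggregation into one explicit conjunction.

    An empty set yields 'true', the identity of conjunction.
    """
    clauses = [f"({predicate(e)})" for e in elements]
    return " && ".join(clauses) if clauses else "true"


# Hypothetical overheat check applied to three temperature signals.
sensors = ["temp0", "temp1", "temp2"]
direct = expand_foreach(sensors, lambda s: f"{s} < 100.0")
print(direct)
```

Changing the limit in the aggregated form touches one value, whereas the unrolled form needs one edit per sensor.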

Fig. 3. R2U2 Configuration Explorer web application: 1) C2PO specification input; 2) C2PO options; 3) C2PO output; 4) AST visualization; 5) AST node data; 6) R2U2 instruction; 7) C engine speed and memory calculator; 8) FPGA speed and size calculator; 9) FPGA design size vs maximum timestamp value.

2.4 Common Subexpression Elimination

C2PO uses an AST as the intermediate representation of its input and can therefore use optimization techniques common in compiler design such as Common Subexpression Elimination (CSE) [6]. Similar to applying the isomorphism elimination rule for Binary Decision Trees [4], CSE prunes all but one instance of any identical AST subtrees, reusing the result from that subtree for monitoring multiple requirements without wasting memory and execution time by representing it redundantly. Analysis of CSE on randomly-generated MLTL requirements resulted in a speed-up of \(37\%\) and required \(4.3\%\) less memory [18]. We expect larger savings in human-authored requirement specifications, however, due to reuse of both common specification patterns and structures in the underlying system. For example, a non-trivial subexpression might represent a system’s confidence in its navigational fix and many specifications might depend on the navigation state, thus re-using this subexpression.
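One common way to realize CSE over an AST is structural interning (hash-consing): identical subtrees map to a single shared node. The sketch below assumes this technique and uses hypothetical node and operator names; it is not a description of C2PO's internal data structures.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Node:
    """Immutable AST node; frozen dataclasses compare and hash structurally."""
    op: str
    children: Tuple["Node", ...] = ()


def count_instances(node: Node) -> int:
    """Number of node instances in a tree when nothing is shared."""
    return 1 + sum(count_instances(c) for c in node.children)


def eliminate_common_subexpressions(roots: List[Node]) -> Tuple[List[Node], List[Node]]:
    """Return rewritten roots plus the list of unique nodes after sharing."""
    table: Dict[Node, Node] = {}

    def intern(node: Node) -> Node:
        kids = tuple(intern(c) for c in node.children)
        candidate = Node(node.op, kids)
        # Reuse the first structurally identical node seen so far.
        return table.setdefault(candidate, candidate)

    return [intern(r) for r in roots], list(table.values())


# Two hypothetical specifications that both depend on the navigation state.
spec0 = Node("G[0,5]", (Node("nav_ok"),))
spec1 = Node("&&", (Node("nav_ok"), Node("alt_ok")))

shared_roots, unique_nodes = eliminate_common_subexpressions([spec0, spec1])
before = count_instances(spec0) + count_instances(spec1)
after = len(unique_nodes)
print(before, after)
```

After interning, both specifications reference the same `nav_ok` object, so its result would be computed and stored once.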

3 Resource Estimation GUI

As R2U2’s user base expands, so does the variance in the domain expertise of these specification authors; R2U2 V3.0 therefore enables resource-aware requirements specification by users without experience with the performance trade-offs of syntactically different but semantically equivalent temporal logic encodings. The R2U2 Configuration Explorer is a web application that provides visual feedback from C2PO about the resource costs of specifications, e.g., in the form of MLTL formulas; see Fig. 3. With a short feedback loop on critical parameters like execution time, memory, and relative formula size, all a user needs to understand is what resources are available on their target system (not R2U2 itself) to write performant specifications that fit the available resources.

3.1 C2PO Feedback

Feedback from C2PO (elements 1–6 in Fig. 3) allows users to visualize the intermediate representation of a given input specification as well as the effects of optimizations and options on their final R2U2 configurations. Properties such as the memory required to represent specifications with differently-sized temporal intervals, or syntactically different but functionally similar checks, can be unintuitive for users to compute on the fly. The AST visualization provides transparency into this process for users unfamiliar with R2U2’s implementation via an interactive web-based interface suited to experimentation with different variations of a possible specification.

3.2 Software Resource Calculator

The software resource calculator (element 7 in Fig. 3) provides users of the R2U2 software implementations with an estimate of the time and memory required to evaluate one time step of a specification in the worst case.

Software Worst-Case Execution Time. The highly optimized nature of R2U2’s software implementations makes runtime performance highly dependent on the target platform’s architecture, C/C++ compiler version, and make environment factors; e.g., the length of the current working directory name can impact cache alignment. We use a simplified computing model to provide an estimation of the computing speed based on the number of CPU cycles required for each operation on the target platform. Users can edit these clock cycle values in the GUI, e.g., to test for platform-specific latencies. The estimated worst-case execution time (WCET) in software \(W_{sw}\) of an AST node g is:

$$\begin{aligned} W_{sw}(g) = \sum _{c \in \mathbb {C}_g}(W_{sw}(c)) + Cycles(g.type) \end{aligned}$$
(1)

where \(\mathbb {C}_g\) is the set of child nodes of g and Cycles is a dictionary mapping AST node types to a corresponding number of clock cycles. For instance, \(Cycles(\wedge ) = 10\) cycles by default.
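Eq. 1 is a straightforward recursion over the AST. A minimal Python sketch follows; the class `AstNode`, the function `wcet_sw`, and every cycle count other than the default of 10 for conjunction are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class AstNode:
    """Hypothetical AST node with a type label and child nodes."""
    type: str
    children: List["AstNode"] = field(default_factory=list)


def wcet_sw(g: AstNode, cycles: Dict[str, int]) -> int:
    """Eq. 1: sum of the children's WCET plus the cycle cost of g's own type."""
    return sum(wcet_sw(c, cycles) for c in g.children) + cycles[g.type]


# Only the value for "&&" comes from the text; the others are made up.
CYCLES = {"&&": 10, "G": 25, "atom": 5}

# G (a && b), with the interval omitted since Eq. 1 does not depend on it.
formula = AstNode("G", [AstNode("&&", [AstNode("atom"), AstNode("atom")])])
estimate = wcet_sw(formula, CYCLES)
print(estimate)
```

With these example costs the estimate is 5 + 5 + 10 + 25 = 45 cycles per time step.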

Software Memory Requirements. R2U2 uses Shared Connection Queues (SCQs) to store verdict-timestamp pairs for each node in the AST. SCQs are single-writer, many-reader circular buffers that buffer the results of dependent temporal expressions that might not be evaluated at the same timestamp. The total SCQ size for a specification is the total number of SCQ slots required by the specification multiplied by the size of one slot. The required number of SCQ slots for a node g is:

$$\begin{aligned} size(g.Queue) = max(max\{s.wpd\text { | }\forall s \in \mathbb {S}_g\}-g.bpd,0)+1 \end{aligned}$$
(2)

where g.Queue is the output SCQ of g, s.wpd is the worst-case propagation delay of node s, g.bpd is the best-case propagation delay of node g, and \(\mathbb {S}_g\) is the set of sibling nodes of g. The propagation delays of a node represent the minimum and maximum number of time steps needed to evaluate the node and are defined recursively in Definition 4 of [18]. Intuitively, a node requires enough memory such that its results will not be overwritten before they are consumed by a parent node. The total SCQ memory of an AST is the sum of the sizes of SCQs of all nodes in the AST.
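Eq. 2 and the total-memory rule can be sketched as follows. The function names, the treatment of a node without siblings (the inner maximum is taken as 0), and the 4-byte slot size in the example are assumptions, not values from R2U2.

```python
from typing import Iterable, List


def scq_slots(sibling_wpds: Iterable[int], g_bpd: int) -> int:
    """Eq. 2: slots needed by node g, given its siblings' worst-case delays.

    Assumption: with no siblings, the inner max is treated as 0.
    """
    max_sibling_wpd = max(sibling_wpds, default=0)
    return max(max_sibling_wpd - g_bpd, 0) + 1


def total_scq_bytes(slot_counts: List[int], slot_bytes: int) -> int:
    """Total SCQ memory: total number of slots times the size of one slot."""
    return sum(slot_counts) * slot_bytes


# A node with best-case delay 2 whose siblings have worst-case delays 10 and 3.
slots_g = scq_slots([10, 3], g_bpd=2)
# A node that always resolves later than its siblings still needs one slot.
slots_h = scq_slots([10], g_bpd=12)
memory = total_scq_bytes([slots_g, slots_h], slot_bytes=4)
print(slots_g, slots_h, memory)
```

Here the first node needs 9 slots because it may produce results up to 8 time steps before its slowest sibling catches up.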

SCQ memory is an estimation of the actual total memory usage, but is typically the largest and most constraining memory type, e.g., as compared to instruction or pointer memory. The R2U2 C implementation statically fixes all memory sizes in advance to avoid dynamic allocation, so the SCQ sizing feedback is useful for: (1) selecting an initial size based on expected usage; and (2) verifying a configuration will fit on a deployed monitor with a fixed SCQ limit.

3.3 Hardware Resource Calculator

The hardware resource calculator (elements 8–9 in Fig. 3) provides estimations for hardware WCET (\(W_{hw}\)), total SCQ memory slots, and a graph for visualizing estimated FPGA resource requirements, namely Look-Up Tables (LUTs) and Block RAMs (BRAMs). Required resources depend on the type of FPGA architecture. The GUI accepts clock rate, LUT-type, timestamp length, and node sizing as parameters to better match the estimate to a target platform. This approach was validated on Virtex-5 and Zynq7000 FPGA platforms as well as the ACTEL ProASIC3L used for Robonaut2 in [18].

Hardware Worst-Case Execution Time. The GUI computes the estimated \(W_{hw}\) using a more precise method than in Sect. 3.2 by taking into account SCQ usage during execution. The R2U2 hardware implementation’s estimated worst-case execution time (\(W_{hw}\)) of an AST node g is:

$$\begin{aligned} \begin{aligned} W_{hw}(g) =&\sum _{c \in \mathbb {C}_g}(W_{hw}(c)) + Latency_{init}(g.type) \\&+ Latency_{eval}(g.type) * \sum _{c \in \mathbb {C}_g}(size(c.Queue)) \end{aligned} \end{aligned}$$
(3)

where \(Latency_{init}\) and \(Latency_{eval}\) are dictionaries mapping AST node types to microsecond latencies corresponding to the initial and evaluation times of the node, respectively. The multiplication accounts for evaluation of each buffered input from the child node, up to the queue size in the worst case.
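Eq. 3 extends the software recursion with a term that scales with the children's queue sizes. The sketch below assumes a hypothetical `HwNode` class that carries its own SCQ size (as computed by Eq. 2) and uses made-up latency values in microseconds.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class HwNode:
    """Hypothetical AST node annotated with the size of its output SCQ."""
    type: str
    queue_size: int
    children: List["HwNode"] = field(default_factory=list)


def wcet_hw(g: HwNode, latency_init: Dict[str, float],
            latency_eval: Dict[str, float]) -> float:
    """Eq. 3: children's WCET + init latency + eval latency per buffered input."""
    child_time = sum(wcet_hw(c, latency_init, latency_eval) for c in g.children)
    buffered_inputs = sum(c.queue_size for c in g.children)
    return child_time + latency_init[g.type] + latency_eval[g.type] * buffered_inputs


# Illustrative latencies only; real values depend on the target FPGA.
LATENCY_INIT = {"atom": 1.0, "&&": 2.0}
LATENCY_EVAL = {"atom": 0.5, "&&": 0.5}

conj = HwNode("&&", queue_size=1,
              children=[HwNode("atom", queue_size=3), HwNode("atom", queue_size=1)])
estimate_us = wcet_hw(conj, LATENCY_INIT, LATENCY_EVAL)
print(estimate_us)
```

With these numbers each atom costs 1.0 µs, and the conjunction adds 2.0 µs of setup plus 0.5 µs for each of the 4 buffered child results, for 6.0 µs in total.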

Hardware Memory Requirements. The hardware resource calculator provides the explicit number of SCQ slots required for the collection of specifications in the specification set (aka configuration) using Formula 2 and summing sizes required for all AST nodes.

FPGAs use BRAMs to implement an R2U2 monitor’s SCQ memory, where the size and number of ports of the BRAMs limit the queue depth of the BRAMs. To compute the required number of BRAMs, let d be the total SCQ size, w be the bit width of each verdict-timestamp pair, \(w_{max}\) be the widest bit width the BRAM can accommodate, and D(w) be the maximum queue depth of a BRAM with verdict-timestamp pair bit width w. The required number of cascaded BRAMs is:

$$\begin{aligned} N_{BRAM}(w,d) = \lceil \frac{d}{D(w_{max})}\rceil *mod(w,w_{max}) + \lceil \frac{d}{D(rem(w,w_{max}))}\rceil \end{aligned}$$
(4)

Hardware LUT Requirements. Each R2U2 operator requires a constant number of comparator and adder/subtractor LUTs, configured by the user in the GUI. The GUI accounts for scaling based on the LUT type and uses the bit width of each verdict-timestamp pair w to estimate total LUT usage. The total number of required comparator LUTs (\(N_{cmp}\)) and adder/subtractor LUTs (\(N_{add}\)) are:

$$ N_{cmp}(w) = {\left\{ \begin{array}{ll} 4*w &{} \text { if LUT-3} \\ 2*w &{} \text { if LUT-4} \\ w &{} \text { if LUT-6} \end{array}\right. } \qquad N_{add}(w) = {\left\{ \begin{array}{ll} 2*w &{} \text { if LUT-3 or LUT-4} \\ w &{} \text { if LUT-6} \end{array}\right. } $$
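The two piecewise definitions translate directly into lookup tables. In the sketch below, the function names and the example width of 33 bits (one verdict bit plus a 32-bit timestamp) are assumptions for illustration.

```python
# Scaling factors per LUT type, taken from the piecewise definitions above.
CMP_FACTOR = {"LUT-3": 4, "LUT-4": 2, "LUT-6": 1}
ADD_FACTOR = {"LUT-3": 2, "LUT-4": 2, "LUT-6": 1}


def comparator_luts(w: int, lut_type: str) -> int:
    """N_cmp(w): comparator LUTs for a verdict-timestamp pair of w bits."""
    return CMP_FACTOR[lut_type] * w


def adder_luts(w: int, lut_type: str) -> int:
    """N_add(w): adder/subtractor LUTs for a verdict-timestamp pair of w bits."""
    return ADD_FACTOR[lut_type] * w


# Hypothetical width: 1 verdict bit + 32-bit timestamp.
w = 33
for lut_type in ("LUT-3", "LUT-4", "LUT-6"):
    print(lut_type, comparator_luts(w, lut_type), adder_luts(w, lut_type))
```

An unsupported LUT type raises `KeyError`, which keeps a typo in the platform setting from silently producing an estimate.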

4 Runtime Engine Improvements

To better serve mission-critical systems that must satisfy strict flight certification requirements (such as NASA’s VSM [8,9,10]), we have made a number of improvements to the internal architecture of the C version of R2U2 that provide memory assurances and flexibility as well as extended computational abilities. Figure 4 depicts this updated architecture.

Static Memory Arenas. The R2U2 V3.0 C version uses only statically-allocated memory. This avoids the many pitfalls of allocating memory (slow allocator calls, fragmentation, leaks, out-of-memory errors, etc.) and guarantees the amount of memory required for the entire execution of R2U2 up front. Additionally, many mission-critical systems either do not have or do not permit dynamic memory allocation, e.g., to satisfy requirements for flight certification [32]. R2U2 now runs unmodified on these platforms as well as traditional systems.

Each type of memory (yellow boxes of Fig. 4) has a predefined “arena” with a maximum size set during integration of the monitor with the target platform. When a user loads an R2U2 configuration, R2U2 fills the slots of these arenas in sequence until the arena is full.
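The arena idea can be modeled in a few lines. The Python class below is only a conceptual model of a fixed-capacity arena filled in sequence; the actual R2U2 arenas are static C memory, and the class name, the `MemoryError` on overflow, and the instruction strings are assumptions.

```python
from typing import List, Optional


class Arena:
    """Fixed-capacity arena: storage is reserved once and filled in order."""

    def __init__(self, capacity: int) -> None:
        # All slots are reserved up front; nothing grows later.
        self._slots: List[Optional[object]] = [None] * capacity
        self._used = 0

    def push(self, item: object) -> int:
        """Store item in the next free slot and return its index."""
        if self._used >= len(self._slots):
            raise MemoryError("arena full: configuration exceeds the configured maximum")
        index = self._used
        self._slots[index] = item
        self._used += 1
        return index

    @property
    def free(self) -> int:
        """Number of slots still available."""
        return len(self._slots) - self._used


# Load a hypothetical three-instruction configuration into a 3-slot arena.
instructions = Arena(capacity=3)
for instr in ["load a0", "load a1", "and a0 a1"]:
    instructions.push(instr)

try:
    instructions.push("one too many")
except MemoryError as err:
    print(err)
```

Because capacity is fixed at construction, the worst-case memory footprint is known before any configuration is loaded.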

Monitor Type Parameterization. Complementary to the switch to static memory, the internals of the reasoning engine are now fully parameterized. A single header file allows users to adjust maximum values, bit widths, and even internal types. Proper tuning has performance benefits, but crucially allows users to fit R2U2 to use the exact amounts of resources available on a target system. For example, limiting the size of the gaps between timestamps, e.g., in cases where the specification will be either reset frequently or evaluated infrequently, allows more SCQs to fit in the same amount of memory permitting larger formula sets with functionally similar behavior.

Arbitrary Data Flow. R2U2 initially worked as a stack of engines, at each timestamp passing results from the Atomic Checker (AT) to the Temporal Logic engine (TL), then passing the TL verdicts through the Bayesian Network (BN) layer to produce that timestamp’s verdict [34]. Now, R2U2 can connect these engines in any order. This simplifies configuration generation from the perspective of C2PO, enabling arbitrary ordering of instructions. Atomic checker properties can now accept results of temporal logic formulas as input, for example, without adding a confusing step delay in the verdict stream.

Fig. 4. Internal architecture of an R2U2 monitor. Orange boxes are streams of data, yellow boxes are memory arenas, and blue boxes are modules. Arrows entering and exiting blue boxes denote read and write relationships respectively. The red arrows denote relationships that are only active upon startup, i.e., when R2U2 populates instruction memory and configures SCQ memory. (Color figure online)

AT Checker Extended Mode. The C version of the atomic checker has an extended mode allowing for additional comparisons and filters beyond the standard hardware-compatible set. In extended mode, the atomic checker produces Boolean “atomics” from conditionals, where each conditional compares the result of a filter to either a constant or another input signal. Filters are predefined functions such as simple data type casts (bool, int, float, etc.) or mathematical functions like rate, moving average, or absolute angle difference. For example:

  • a5 := abs_diff_angle(s3,105) < 50; checks if the absolute difference between the data of signal 3 and the value 105 when treated as angles is below 50.

  • a43 := int(s32) == s33; checks that the values of signals 32 and 33 are in agreement when treated as integers.
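To make the two example atomics concrete, the sketch below evaluates them in Python. It assumes angles are in degrees and wrap at 360, and that the integer cast truncates toward zero; neither assumption is confirmed by the text, and this `abs_diff_angle` is a stand-in, not R2U2's filter implementation.

```python
def abs_diff_angle(a_deg: float, b_deg: float) -> float:
    """Smallest absolute difference between two angles (assumed degrees)."""
    diff = abs(a_deg - b_deg) % 360.0
    return min(diff, 360.0 - diff)


def atomic_a5(s3: float) -> bool:
    """a5 := abs_diff_angle(s3,105) < 50"""
    return abs_diff_angle(s3, 105.0) < 50.0


def atomic_a43(s32: float, s33: int) -> bool:
    """a43 := int(s32) == s33, with int() truncating toward zero."""
    return int(s32) == s33


# A heading of 130 is within 50 of 105; a heading of 350 is not.
print(atomic_a5(130.0), atomic_a5(350.0))
# 7.9 truncates to 7 and therefore agrees with the integer signal 7.
print(atomic_a43(7.9, 7))
```

The wrap-around matters near the 0/360 boundary: 355 and 5 differ by 10 under this definition, not by 350.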

Booleanizer. The R2U2 V3.0 C implementation includes a new general-purpose computing module that uses a three-address code representation [7] called the Booleanizer that can take the place of the AT checker. This module enables arbitrary expressions over non-Boolean data types using arithmetic, bitwise, and relational operators as well as extended set aggregation operators such as “forexactlyn” or “foratmostn” operators.
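The counting operators can be read as follows, going only by their names: “forexactlyn” holds when exactly n elements of a set satisfy a predicate, and “foratmostn” when no more than n do. The Python sketch below encodes that assumed reading with hypothetical function names and sample data; it does not reflect the Booleanizer's three-address code.

```python
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")


def count_satisfying(elements: Iterable[T], predicate: Callable[[T], bool]) -> int:
    """Number of elements for which the predicate holds."""
    return sum(1 for e in elements if predicate(e))


def for_exactly_n(elements: Iterable[T], n: int, predicate: Callable[[T], bool]) -> bool:
    """Assumed reading of 'forexactlyn': exactly n elements satisfy the predicate."""
    return count_satisfying(elements, predicate) == n


def for_at_most_n(elements: Iterable[T], n: int, predicate: Callable[[T], bool]) -> bool:
    """Assumed reading of 'foratmostn': at most n elements satisfy the predicate."""
    return count_satisfying(elements, predicate) <= n


# Hypothetical temperature readings; one sensor exceeds the limit of 100.
temps = [71.5, 99.0, 103.2, 88.4]


def overheated(t: float) -> bool:
    return t > 100.0


print(for_exactly_n(temps, 1, overheated), for_at_most_n(temps, 0, overheated))
```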

5 Discussion

R2U2’s toolchain now provides an effective means by which to formalize, validate, and verify system requirements in real time, giving users control and transparency of the memory and feature set of their target-specific monitors. We have combined the collection of capabilities from previously-published R2U2 case studies into one modular, centralized implementation that we have rigorously evaluated for correctness (e.g., using [19, 38]).

C2PO and its new specification language enable higher-level abstractions for users that make the specification development process faster, more transparent, and less reliant on a deep understanding of R2U2’s underlying algorithms. The new GUI front-end allows up-front specification design and resource usage estimation by system designers so that users can rapidly prototype specifications before downloading and using R2U2. These improvements make specifying, validating, and monitoring system requirements easier and more accessible to the systems that stand to benefit most from RV. Since specification is the biggest bottleneck to formal methods and autonomy [37], this is an important feature for an RV engine.

It is now much easier to integrate R2U2 into production environments, like NASA cFS/cFE [25, 28] or ROS [31], due to the unified front end compiler, expanded engine capabilities, and better user tooling. Recently R2U2 has launched on several real-life, full-scale air and space missions, largely enabled by these advancements. This major upgrade lays a solid foundation for expanded RV capabilities and integration into a wider array of missions and embedded architectures.