Advanced virtual prototyping for cyber-physical systems using RISC-V: implementation, verification and challenges

Virtual prototypes (VPs) are crucial in today’s design flow. VPs are predominantly created in SystemC transaction-level modeling (TLM) and are leveraged for early software development and other system-level use cases. Recently, virtual prototyping has been introduced for the emerging RISC-V instruction set architecture (ISA) and has become an important piece of the growing RISC-V ecosystem. In this paper, we present enhanced virtual prototyping solutions tailored for RISC-V. The foundation is an advanced open-source RISC-V VP implemented in SystemC TLM and designed as a configurable and extensible platform. It scales from small bare-metal systems to large multi-core systems that run applications on top of the Linux operating system. Based on the RISC-V VP, this paper also discusses advanced VP-based verification approaches and open challenges. In combination, we provide for the first time an integrated and unified overview and perspective on advanced virtual prototyping for RISC-V.


Introduction
Modern Internet-of-Things (IoT) devices and cyber-physical systems (CPS) are highly interconnected devices that offer enormous innovations from the application perspective. However, from the design flow perspective the existing challenges are amplified. On the one hand, the complexity of these systems is rising continuously to provide the necessary functionality; on the other hand, several conflicting design requirements need to be dealt with simultaneously. Key requirements include a high processing efficiency in combination with real-time computing capabilities as well as extensive connectivity to provide smart functions. Furthermore, safety and security combined with a high reliability are indispensable. Yet, these devices have to be cheap and operate with limited resources in a very energy-efficient way. To meet these requirements, highly application-specific solutions are required that are tailored for each specific system. In particular, optimizations and application-specific extensions of the processor can bring huge benefits.
RISC-V [1,2], a free and open-source instruction set architecture (ISA), has significant potential to become a game changer in this area. RISC-V is designed from the ground up to be a highly configurable and extensible ISA. It offers a base integer instruction set in combination with several optional standard instruction set extensions which can be selected on a very fine-grained basis. Addition of custom instructions, for example to boost performance or energy efficiency, is a built-in feature in RISC-V that is commonly used. This extensive modularity and enormous flexibility, combined with the fact that RISC-V is license-free and royalty-free, make RISC-V well suited for building highly efficient application-specific processors that meet the ambitious requirements of next-generation CPS and IoT devices. RISC-V is backed by an already extensive ecosystem that continues to grow rapidly. On the hardware (HW) side, several processor implementations are available (including open-source, free and commercial). On the software (SW) side, the ecosystem offers compilers, operating systems and simulators among others. Recently, virtual prototypes (VPs) have been introduced into the RISC-V ecosystem to complement the existing simulators in dealing with advanced system-level use-cases (such as design space exploration).
Leveraging VPs early in the design flow is a major industry-proven approach [3,4]. A VP is essentially an abstract executable model that is designed from the ground up to represent the entire HW platform. In industrial practice, VPs are predominantly created in SystemC transaction level modeling (TLM) [5,6]. VPs offer a middle ground between high-speed functional simulators (like QEMU) and register-transfer level (RTL) simulations, being more accurate than the former and faster than the latter. They facilitate SW development by supporting analysis of complex HW/SW interactions. Advanced SystemC-based techniques that consider extra-functional aspects (e.g., [7][8][9][10][11][12]) are applied to support other use-cases such as power and timing analysis based on the VP. Besides modeling and simulation, verification is crucial to avoid bugs that can lead to functional failure or even open security vulnerabilities that might be exploited. Due to their ease of use and scalability, simulation-based methods still form the main backbone of the verification effort. However, it is difficult to achieve a high verification quality with comprehensive coverage results and therefore sophisticated verification techniques are required.
In this paper, we present and discuss enhanced virtual prototyping solutions tailored for RISC-V. In particular, we build upon our recent previous studies in the context of RISC-V virtual prototyping [13][14][15][16][17][18][19][20][21][22][23][24], which consider different modeling and verification aspects in isolation, and for the first time show an integrated and unified view on advanced virtual prototyping for RISC-V 1) . The foundation is our RISC-V VP that is implemented in SystemC TLM and designed as a configurable and extensible platform (Section 4). It scales from small bare-metal systems to large multi-core systems that run applications on top of the Linux operating system. Our RISC-V VP is fully open source 2) (MIT license) to expand the ecosystem of RISC-V and stimulate further development and research. Our solution is the only freely available SystemC-based VP that is capable of booting Linux, to the best of our knowledge. Using our RISC-V VP as foundation, in this paper we discuss advanced verification approaches tailored for RISC-V that cover the major stages of a VP-based design flow. This includes verification of the VP (Section 5), the embedded SW (Section 6) and a VP-based cross-level methodology for RTL verification (Section 7). We also review related work (Section 2), provide a discussion on a modern VP-based design flow with a focus on the CPS/IoT domain (Section 8) and sketch ideas for future work as well as open challenges (Section 9). In combination, this paper for the first time shows an integrated and unified view on advanced virtual prototyping for RISC-V.

Related work
In this section, we review related work on RISC-V simulation and verification as well as verification of SystemC designs and embedded SW with a focus on formal techniques.

RISC-V simulation
Simulators are very important tools for RISC-V and hence the RISC-V ecosystem already has several different simulators that have been designed with different use-cases in mind to complement each other.
One important part is high-speed instruction set simulators (ISSs) such as QEMU 3) (which by now offers comprehensive RISC-V support), the official reference simulator SPIKE 4) or RV8 5). Their primary use-case is high-speed simulation and for this reason they employ aggressive optimization techniques such as dynamic binary translation (from RISC-V to native x86-64). However, this makes integration of accurate models for extra-functional information much more challenging.
1) Our most recent RISC-V related approaches: http://www.systemc-verification.org/risc-v.
2) The GitHub link of our RISC-V VP as well as the most recent RISC-V VP updates and related information: http://www.systemc-verification.org/riscv-vp.
Another direction is full-system simulators, which include gem5 [25] and Renode 6). The simulator gem5 is designed for architectural explorations. Therefore, gem5 provides detailed models for the memory and processor that can also be extended with extra-functional properties. Renode goes beyond the single system level and enables simulation of multiple embedded systems arranged as multi-node networks. However, neither gem5 nor Renode employs the standardized SystemC modeling style, which precludes integration of advanced SystemC-based methods.
Yet another direction is approaches that aim to provide executable formalizations of the RISC-V ISA. FORVIS 7) and GRIFT 8) are implemented in Haskell. Besides being executable, they can serve as a foundation for formal analysis techniques. SAIL-RISCV 9) is a Sail-based implementation. Sail is a special language designed for describing ISAs. It offers support for generating simulation back-ends and definitions for theorem provers.
A few approaches have been designed to work in combination with SystemC. ETISS [26] 10) is a configurable ISS that provides RISC-V support, leverages dynamic binary translation (DBT) to achieve a high performance and is implemented in C++. ETISS can be used standalone or integrated with a SystemC-based simulation. It complements our approach, as we conceptually could use ETISS as a replacement ISS in our VP. A similar direction to ETISS is pursued by DBT-RISE, which is a generic framework to implement efficient DBT-based ISSs and offers RISC-V support as well 11). The ISS is designed to be embedded in a SystemC-based VP platform 12), which can be considered comparable to our effort. Another simulator implemented in SystemC TLM is RISC-V-TLM 13). However, currently its support for the RISC-V ISA is very limited.
Finally, a set of commercial VP tools such as Mentor Vista or Synopsys Virtualizer is available, and may offer extensive RISC-V support, though their implementation is proprietary.
Our RISC-V VP is fully open source to expand the ecosystem of RISC-V and thereby provide a strong foundation for advanced system-level use-cases based on SystemC TLM-2.0, which is an industry-proven modeling standard (IEEE-1666). Also, our solution is the only freely available SystemC-based VP that is capable of booting Linux, to the best of our knowledge.

RISC-V verification
Recently, a handful of verification approaches tailored for RISC-V have emerged. The most basic approaches are the official RISC-V unit 14) and compliance 15) test-suites. They are easy to use and recent reports indicate that a very high quality has been reached regarding the compliance test-suite 16). However, with a predefined test-suite, the comprehensiveness of testing is still limited. Therefore, automated test generation approaches have been designed for RISC-V. The Scala-based Torture Test generator 17) integrates predefined randomized test-sequences to generate tests. A combination of SystemVerilog with universal verification methodology (UVM) is leveraged by Google's RISCV-DV 18) to generate instruction streams which are specified using constrained-random descriptions. A commercial RTL simulator that provides support for SystemVerilog and UVM is required. In addition, in our previous studies we developed advanced test generation techniques for RISC-V that leverage fuzzing and constraint-based specifications [19][20][21][22]. Our techniques have been very effective in finding new bugs in RISC-V simulators and an industrial pipelined RISC-V core. In contrast to the aforementioned RISC-V simulation-based approaches, our techniques leverage advanced complementary test-generation techniques which enable them to find intricate bugs. We discuss these methods in Section 7.

SystemC verification
Simulation is still prevalent for SystemC verification due to its ease of use and scalability [5] 21). Strong enhancements have been proposed to the basic simulation method by adding support for validation of TLM assertions, e.g., [36][37][38][39]. In the next step, approaches that utilize partial order reduction (POR) [40,41] have been proposed [42,43] to improve the simulation coverage further. They make it possible to efficiently explore all different process execution orders by pruning redundant orders. However, it is still necessary to provide representative inputs. To address this issue, formal verification approaches have been developed for SystemC TLM. The first approaches, e.g., [44][45][46][47], had problems in modeling the simulation semantics of SystemC accurately or suffered from severely limited scalability [48].
In the following, we briefly summarize the more recent formal verification approaches for SystemC. KRATOS [49] combines symbolic abstraction refinement with an explicit scheduler for efficient handling of cyclic state spaces. SCIVER [50] translates SystemC designs into sequential C models and applies state-of-the-art C model checkers. SDSS [51] formalizes and encodes the complete state space of the SystemC design into an SMT formula. In [52], the approach was boosted with POR support. STATE [53] translates SystemC designs to timed automata and applies the UPPAAL model checker. In [54], a concolic testing approach tailored for bug hunting has been presented. Please refer to the survey [55] for more information on SystemC verification approaches.

Embedded SW verification
Most SW verification methods focus on non-embedded SW. This kind of SW has no or only limited interaction with the HW. KLEE [56] and SAGE [57] are representative candidates that made symbolic execution techniques applicable to large SW. Subsequent work focused on improving scalability further and mostly targeted binary level SW to obtain accurate results, e.g., S2E [58], Mayhem [59] or Angr [60]. Another active research area is verification of multi-threaded programs, e.g., [61], which has applications to interrupt verification as well [62].
To deal with embedded SW, specialized HW/SW symbolic/concolic co-validation approaches have been devised. Their primary difference lies in the method used to integrate the underlying HW. Virtual peripheral models that are extracted manually from QEMU are leveraged by [63,64]. In [65], HW Verilog models are used instead. In [66], a symbolic execution environment that is powered by KLEE and tailored for the MSP430 microcontroller is provided. Physical devices are integrated by [67] to enable hybrid binary concolic testing. Our concolic testing approach specifically targets RISC-V embedded binaries (Section 6).

RISC-V
RISC-V is a very modular, configurable and extensible ISA. The foundation is the mandatory base integer (I) instruction set. It is available in 32, 64 and even 128 bit with corresponding register widths, denoted by RV32, RV64 and RV128, respectively. In addition, optional standard instruction set extensions such as multiply/divide (M), atomic operations (A), compressed instructions (C) and floating point with single (F) or double (D) precision are available. Designated instruction set encoding spaces are reserved for custom instruction set extensions. The standard instruction set combination is denoted by the single letter G = IMAFD. More information on the standard instruction sets is available in the RISC-V user-level ISA specification [1].
Another important part of the RISC-V ISA is the privileged architecture description [2]. It describes important concepts and instructions that are used for advanced operations such as trap/interrupt handling and operating system support. In particular, control and status registers (CSRs) play a central role here. CSRs are registers that serve a special purpose to interface between HW and SW. For example, mtvec is configured by the SW to store the trap handler address and mtval provides exception-specific information to the SW in case of a trap. Besides the mandatory machine mode (M), the supervisor (S) and user (U) operation modes are defined to support different privilege levels.
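The naming scheme described above can be made concrete with a small sketch. The helper `parse_isa` below is hypothetical (it is not part of the RISC-V VP) and deliberately covers only the register widths and single-letter extensions named in this section; real ISA strings admit further extensions that are ignored here.

```python
# Hypothetical helper: expand an ISA string such as "RV32IMAC" or "RV64GC".
# Only the extensions named in the text are known to this sketch.
STANDARD_EXTENSIONS = {
    "I": "base integer",
    "M": "multiply/divide",
    "A": "atomic operations",
    "F": "single-precision floating point",
    "D": "double-precision floating point",
    "C": "compressed instructions",
}


def parse_isa(isa):
    """Return (register width in bits, ordered list of extension letters)."""
    isa = isa.upper()
    if not isa.startswith("RV"):
        raise ValueError("ISA string must start with 'RV'")
    pos = 2
    digits = ""
    while pos < len(isa) and isa[pos].isdigit():
        digits += isa[pos]
        pos += 1
    if digits not in ("32", "64", "128"):
        raise ValueError(f"unsupported register width: {digits!r}")
    letters = []
    for ch in isa[pos:]:
        expanded = "IMAFD" if ch == "G" else ch  # G is shorthand for IMAFD
        for ext in expanded:
            if ext not in STANDARD_EXTENSIONS:
                raise ValueError(f"unknown extension: {ext}")
            if ext not in letters:
                letters.append(ext)
    if "I" not in letters:
        raise ValueError("the base integer set I is mandatory")
    return int(digits), letters
```

For instance, `parse_isa("RV32IMAC")` yields a 32 bit configuration with the I, M, A and C sets, which is the configuration of the HiFive1 core discussed later.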
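As a minimal illustration of how CSRs mediate between HW and SW during a trap, the following sketch models a machine-mode trap entry. The class `Hart` and its methods are hypothetical names chosen for this example. In addition to mtvec and mtval, it uses the CSRs mepc and mcause, which are defined by the privileged specification but not discussed above. Privilege-mode changes, status register updates and vectored interrupt dispatch are omitted.

```python
LOAD_ACCESS_FAULT = 5  # mcause code for a load access fault


class Hart:
    """Hypothetical, heavily simplified model of machine-mode trap entry."""

    def __init__(self):
        self.pc = 0
        self.csr = {"mtvec": 0, "mepc": 0, "mcause": 0, "mtval": 0}

    def write_csr(self, name, value):
        # In a real system this is done by the SW with a CSR instruction.
        self.csr[name] = value

    def take_trap(self, cause, tval):
        self.csr["mepc"] = self.pc      # remember where the trap occurred
        self.csr["mcause"] = cause      # why the trap occurred
        self.csr["mtval"] = tval        # exception-specific information
        # The low two bits of mtvec hold the mode; the rest is the base address.
        self.pc = self.csr["mtvec"] & ~0x3


hart = Hart()
hart.write_csr("mtvec", 0x8000_0100)    # SW installs its trap handler
hart.pc = 0x8000_0040
hart.take_trap(LOAD_ACCESS_FAULT, tval=0xDEAD_0000)
```

After the call, execution continues at the handler address, and the handler can read mcause and mtval to decide how to react.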

SystemC and TLM
SystemC is a standardized C++-based modeling language that together with TLM is an industry-proven combination to build VPs [4]. At the heart of SystemC is the event-driven simulation kernel [5] that orchestrates the execution of processes. Processes are the foundation to describe behavior in SystemC. They are triggered by events and scheduled non-preemptively by the kernel, i.e., a process has to give back the control explicitly by calling wait. Modules and ports are leveraged to describe the structure of a SystemC design.
TLM transactions are the main communication mechanism between different SystemC modules [72]. Compared with an RTL simulation, this abstraction enables drastic speed-ups in simulation performance, i.e., up to a factor of 1000. A transaction object contains the necessary data to implement different memory access operations. This includes the access type (e.g., read or write) and address as well as the data (payload). Based on the address, the transaction is routed from the initiator to the target on a bus system. A delay can be passed alongside the transaction to obtain a more accurate estimation of the elapsed simulation time. All this is specified in the SystemC TLM-2.0 standard to ensure compatibility and interoperability of different components.
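The non-preemptive scheduling principle can be sketched outside of SystemC. The following toy kernel is an assumption-laden analogy and not the SystemC API: Python generators play the role of processes, `yield event` plays the role of wait, and the names `Kernel`, `Event`, `producer` and `consumer` are hypothetical. Simulation time and delta cycles are not modeled.

```python
from collections import deque


class Event:
    def __init__(self):
        self.waiting = []          # processes currently suspended on this event


class Kernel:
    def __init__(self):
        self.runnable = deque()
        self.log = []

    def spawn(self, process):
        self.runnable.append(process)

    def notify(self, event):
        # Wake every process that waits on the event.
        self.runnable.extend(event.waiting)
        event.waiting.clear()

    def run(self):
        while self.runnable:
            process = self.runnable.popleft()
            try:
                event = next(process)   # runs until the process yields control
            except StopIteration:
                continue                # process finished
            event.waiting.append(process)


def producer(kernel, mailbox, data_ready, data_taken):
    for value in (1, 2):
        mailbox.append(value)
        kernel.log.append(("produced", value))
        kernel.notify(data_ready)
        yield data_taken                # corresponds to wait(data_taken)


def consumer(kernel, mailbox, data_ready, data_taken):
    while True:
        yield data_ready                # corresponds to wait(data_ready)
        kernel.log.append(("consumed", mailbox.pop()))
        kernel.notify(data_taken)


def demo():
    kernel, mailbox = Kernel(), []
    data_ready, data_taken = Event(), Event()
    # The consumer is started first so that it already waits on data_ready.
    kernel.spawn(consumer(kernel, mailbox, data_ready, data_taken))
    kernel.spawn(producer(kernel, mailbox, data_ready, data_taken))
    kernel.run()
    return kernel.log
```

Because a process only loses control at a yield, no locking is needed around the shared mailbox; the same property is what makes process execution orders in SystemC explicit and thus amenable to the analyses discussed later.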
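To illustrate the routing principle, the following sketch mimics a blocking transport call in plain Python. It is not the TLM-2.0 API: `Transaction`, `Memory` and `Bus` are hypothetical stand-ins, and the latency values are made up. The bus selects a target by address range, translates the global address into a target-local offset and returns the accumulated delay.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    command: str        # "read" or "write"
    address: int
    data: bytearray     # payload; filled in by the target on a read


class Memory:
    def __init__(self, size, latency_ns):
        self.mem = bytearray(size)
        self.latency_ns = latency_ns

    def transport(self, trans, delay_ns):
        end = trans.address + len(trans.data)
        if end > len(self.mem):
            raise ValueError("access beyond end of memory")
        if trans.command == "write":
            self.mem[trans.address:end] = trans.data
        else:
            trans.data[:] = self.mem[trans.address:end]
        return delay_ns + self.latency_ns   # annotate the access latency


class Bus:
    def __init__(self):
        self.ports = []                     # (start, end inclusive, target)

    def map(self, start, end, target):
        self.ports.append((start, end, target))

    def transport(self, trans, delay_ns):
        for start, end, target in self.ports:
            if start <= trans.address <= end:
                # Targets see addresses relative to their own base address.
                local = Transaction(trans.command, trans.address - start, trans.data)
                return target.transport(local, delay_ns)
        raise ValueError(f"no target mapped at {trans.address:#x}")


bus = Bus()
bus.map(0x8000_0000, 0x8000_00FF, Memory(0x100, latency_ns=10))
write_delay = bus.transport(Transaction("write", 0x8000_0010, bytearray(b"\x01\x02\x03\x04")), 0)
read = Transaction("read", 0x8000_0010, bytearray(4))
read_delay = bus.transport(read, 5)
```

The initiator decides what to do with the returned delay: it can synchronize with the simulation kernel immediately or keep accumulating it, which is the basis of the temporal decoupling optimization.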

RISC-V VP implementation
In this section, we briefly summarize the main features of our open-source RISC-V VP. For a more in-depth discussion on the implementation aspects, please refer to our implementation paper [13]. Figure 1 (left side) gives an overview on the main architecture of our RISC-V VP. It is designed as a configurable and extensible platform with a TLM 2.0 bus at the center. A 32 and 64 bit ISS with all RISC-V standard instruction set extensions is provided. Multiple ISSs can be instantiated to build a multi-core platform. Each ISS is attached to the bus via a memory interface, which is responsible for translating load and store instructions into TLM transactions. A memory management unit (MMU) is provided with the ISS to translate virtual to physical addresses. Interrupts are processed by RISC-V specific interrupt controllers (CLINT and PLIC in Figure 1). CLINT handles timer and SW interrupts, while PLIC deals with interrupts coming from other devices (it prioritizes and routes them to the ISS). An essential set of additional peripherals (e.g., UART, sensor) is provided as well. They are accessed through memory-mapped I/O and can also be bus masters (like a DMA controller). Finally, a special system call handler is provided to (optionally) directly execute C/C++ library system calls. Our VP supports several operating systems including Zephyr, FreeRTOS, RIOT and Linux. Executable RISC-V binaries (ELF files) are loaded by means of an ELF loader component, which is responsible for loading the memory image and setting up the program counter of the ISS.

Extension and configuration
Peripherals are attached to the central TLM 2.0 bus based on an address range mapping. This mapping is easily configurable. At the same time, additional peripherals can be easily integrated by attaching them to the bus system. This is very important to build different application-specific platforms. A major use-case for RISC-V is integration of custom instruction set extensions to boost the overall efficiency. For this reason, the ISS can be extended accordingly with custom decode and execute functions for the instruction set.
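The idea of extending an ISS with custom decode and execute functions can be sketched as follows. This is not the extension interface of our VP: the class `Iss`, its `register` method and the Hamming-distance instruction are hypothetical and chosen only for illustration. The sketch handles register-register (R-type) instructions only and places the custom instruction in the custom-0 opcode space that the ISA reserves for such extensions.

```python
OPCODE_OP = 0x33        # standard register-register ALU opcode
OPCODE_CUSTOM_0 = 0x0B  # encoding space reserved for custom extensions


def encode_r(opcode, rd, rs1, rs2, funct3=0, funct7=0):
    """Assemble an R-type instruction word."""
    return (funct7 << 25) | (rs2 << 20) | (rs1 << 15) | (funct3 << 12) | (rd << 7) | opcode


class Iss:
    def __init__(self):
        self.regs = [0] * 32
        self.handlers = {}
        self.register(OPCODE_OP, 0, 0, lambda a, b: a + b)   # add

    def register(self, opcode, funct3, funct7, execute):
        # Extension point: map a decoded encoding to an execute function.
        self.handlers[(opcode, funct3, funct7)] = execute

    def step(self, instr):
        opcode = instr & 0x7F
        rd = (instr >> 7) & 0x1F
        funct3 = (instr >> 12) & 0x7
        rs1 = (instr >> 15) & 0x1F
        rs2 = (instr >> 20) & 0x1F
        funct7 = (instr >> 25) & 0x7F
        execute = self.handlers.get((opcode, funct3, funct7))
        if execute is None:
            raise ValueError(f"illegal instruction: {instr:#010x}")
        result = execute(self.regs[rs1], self.regs[rs2]) & 0xFFFFFFFF
        if rd != 0:                 # x0 is hard-wired to zero
            self.regs[rd] = result


iss = Iss()
iss.regs[1], iss.regs[2] = 0b1010, 0b0110
iss.step(encode_r(OPCODE_OP, rd=3, rs1=1, rs2=2))            # add x3, x1, x2

# Hypothetical custom instruction: Hamming distance of two registers.
iss.register(OPCODE_CUSTOM_0, 0, 0, lambda a, b: bin(a ^ b).count("1"))
iss.step(encode_r(OPCODE_CUSTOM_0, rd=4, rs1=1, rs2=2))
```

Keeping decode and execute behind one registration call means that an application-specific platform can add instructions without touching the base instruction set code.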

HiFive1 configuration
One example that demonstrates the extensibility of our VP is the HiFive1 configuration. It resembles the HiFive1 board from SiFive and in particular is binary compatible with this board, i.e., the same RISC-V binary can be executed on the VP and the real board without modifications. The HiFive1 board integrates the FE310-G000 SoC 22). It has an RV32IMAC core, code and data memories as well as numerous peripherals. This includes CLINT and PLIC-based interrupt controllers as well as UARTs and GPIOs for environment interaction. UART output is redirected to the console and for the GPIOs we provide an interprocess communication interface to enable access to an external environment model. As an example, we built a Qt-based framework that supports graphical input and output components (such as buttons and LEDs) that can be attached to the GPIO interface.

Performance optimization
Performance optimizations are very important to boost the simulation speed and thus facilitate early SW development and testing. Therefore, we integrated two common optimizations designed for SystemC-based simulations.
(1) Direct memory interface (DMI) is utilized to speed up memory access operations by bypassing the bus system. This includes fetch as well as load and store operations.
(2) Temporal decoupling makes it possible to postpone context switches to the SystemC simulation kernel by utilizing a local time quantum to run ahead of the global simulation time. This is particularly useful inside of the ISS.
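The effect of temporal decoupling can be illustrated with a small sketch. It is a simplified analogy and not SystemC code: `Kernel.wait` stands in for a context switch to the simulation kernel, and the function `run_iss` as well as the timing values are hypothetical. The ISS accumulates a local time offset and only synchronizes once the quantum is exhausted.

```python
class Kernel:
    """Stand-in for the simulation kernel; counts how often control is handed back."""

    def __init__(self):
        self.now_ns = 0
        self.context_switches = 0

    def wait(self, delay_ns):
        self.now_ns += delay_ns
        self.context_switches += 1


def run_iss(kernel, num_instructions, cycle_ns, quantum_ns):
    local_offset_ns = 0                      # how far the ISS has run ahead
    for _ in range(num_instructions):
        local_offset_ns += cycle_ns          # "execute" one instruction
        if local_offset_ns >= quantum_ns:    # quantum exhausted: synchronize
            kernel.wait(local_offset_ns)
            local_offset_ns = 0
    if local_offset_ns:
        kernel.wait(local_offset_ns)         # flush the remaining offset


coupled = Kernel()
run_iss(coupled, num_instructions=1000, cycle_ns=10, quantum_ns=10)

decoupled = Kernel()
run_iss(decoupled, num_instructions=1000, cycle_ns=10, quantum_ns=1000)
```

Both runs end at the same simulation time, but the decoupled run performs 10 instead of 1000 context switches. The price is that other processes observe the ISS activity with a granularity of one quantum.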

Eclipse-based SW debugging
Strong SW debugging capabilities are among the key features of a VP as they are paramount to investigate intricate errors. In addition, debugging at the VP level provides reproducible results due to the deterministic simulation environment, thus making it even more valuable. We implemented the GDB remote serial protocol (RSP) interface to provide comprehensive SW debugging support in our VP. Graphical interfaces that support RSP can be attached directly; this includes the Eclipse IDE (see Figure 1(b)). Features include stepping through the SW (on the binary and source code level), setting breakpoints (w/o conditions) and accessing variables (reading and writing). In addition, support for debugging multi-core SW applications is provided as well.

Timing model
To support evaluation of extra-functional properties, corresponding models are integrated with the VP-based simulation. In the ISS, we provide an instruction-based timing model that allows fixed (though configurable) delays to be annotated for each instruction type. This is a very generic timing model that has negligible performance impact and can be leveraged to obtain a first approximate estimation of the SW execution times. In addition, more complex timing models can be integrated through specific interface mechanisms, which we demonstrated by providing a timing model designed to match the RISC-V 32 bit E31 core from SiFive [73]. These interface mechanisms make it possible to cover pipeline, branch prediction and caching effects, which are important to obtain more accurate execution time estimations. However, this more accurate estimation comes at two costs: first, the simulation performance is noticeably affected by modeling these additional microarchitectural effects and second, such a timing model is no longer generic but needs to be provided specifically for the microarchitecture at hand.
As a complementary technique we investigated runtime adaptive simulations that make it possible to switch the accuracy setting at runtime through user-defined configurations. This makes it possible, for example, to skip the boot process of an operating system by using a fast simulation technique and focus on the actual application using a more accurate technique. To boost the simulation performance we investigated just-in-time compilation techniques in this runtime adaptive setting [74].
In addition to the ISS, the bus system and peripherals play a very important role as well to obtain accurate timing estimation results, for example for the purpose of an early design space exploration. Currently, we employ the so-called loosely-timed modeling style to integrate our peripherals on top of a generic TLM-2.0 bus system. For a timing estimation, TLM transactions can be annotated with optional delays to support more accurate timings of SW that interacts with peripherals. By using a generic bus system, the VP can also serve as a foundation to integrate the more accurate approximately-timed modeling style. This yields more accurate timing results at the cost of a lower simulation performance. A good compromise between simulation performance and timing estimation accuracy may be obtained by selectively refining specific communication protocols with certain peripherals. Moreover, for other architectures, such as ARM, hybrid solutions that integrate FPGA-based emulation with VP-based models have been leveraged to obtain fast and accurate simulations 23). Such solutions are also applicable for RISC-V in general.
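A minimal sketch of such an instruction-based timing model is given below. The class name, the instruction categories and all cycle counts are assumptions made for illustration and do not reflect the values used in our VP or in any real core.

```python
from collections import Counter

# Hypothetical default delays in cycles per instruction type.
DEFAULT_CYCLES = {"alu": 1, "branch": 2, "load": 3, "store": 2, "mul": 4}


class InstructionTimingModel:
    def __init__(self, cycles_per_type=None, clock_period_ns=10):
        self.cycles_per_type = dict(DEFAULT_CYCLES)
        if cycles_per_type:
            self.cycles_per_type.update(cycles_per_type)   # user configuration
        self.clock_period_ns = clock_period_ns
        self.executed = Counter()

    def on_instruction(self, kind):
        """Called by the ISS after each instruction; returns the delay to annotate."""
        self.executed[kind] += 1
        return self.cycles_per_type[kind] * self.clock_period_ns

    def total_ns(self):
        cycles = sum(self.cycles_per_type[kind] * count
                     for kind, count in self.executed.items())
        return cycles * self.clock_period_ns


model = InstructionTimingModel()
delays = [model.on_instruction(kind) for kind in ("alu", "load", "alu", "branch")]
```

The model is a pure table lookup, which is why its performance impact is negligible; pipeline, branch prediction or cache effects would require state that depends on the instruction history.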

Verifying the VP
Extensive verification of the VP is crucial, because the VP serves as reference model for subsequent development steps and as platform for SW development. First, we describe the basic RISC-V test infrastructure and how we tested our VP (Subsection 5.1), and then we present a summary of our formal verification techniques tailored for SystemC TLM designs (Subsection 5.2) that we plan to utilize to formally verify our VP in the next steps.

Testing
To test our VP we leveraged existing test-suites, in particular the RISC-V unit tests and the modern compliance tests (we provide more details on compliance testing in Section 7). These test-suites essentially cover important basic functionality as well as several corner-case scenarios. However, their thoroughness is necessarily limited (because the test-suite has a fixed size). Therefore, we also utilized RISC-V test generation approaches, in particular: (1) the Scala-based Torture test generator that randomizes predefined instruction templates into new test sequences; and (2) a fuzzing-based test generation engine that creates instruction sequences in a coverage-driven way [22].
Figure 2 (Color online) Overview on the RISC-V Torture and CGF approach for VP testing.
While Torture focuses on positive testing, our fuzzing-based approach complements it with negative testing capabilities. Thus, in combination, a very solid verification of RISC-V based systems is achieved.
The test generation approaches (and the compliance tests) follow a signature-based test infrastructure. Figure 2 shows an overview. A test-suite is constructed that ultimately is compiled into executable RISC-V binaries (ELF files) that represent the test-cases. Each binary is loaded and executed on the simulator under test (our VP in this case) and a reference simulator (we used the official RISC-V reference simulator SPIKE), which both produce signatures. A signature is a memory dump that provides the test result. A difference in the signatures indicates a bug in the simulator under test.
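The signature comparison step can be sketched as follows. The function names and the 4-byte word size are assumptions for illustration; the actual test infrastructure may format and compare signatures differently.

```python
def split_words(signature, word_size=4):
    return [signature[i:i + word_size] for i in range(0, len(signature), word_size)]


def compare_signatures(under_test, reference, word_size=4):
    """Return a list of (word index, word under test, reference word) mismatches."""
    test_words = split_words(under_test, word_size)
    ref_words = split_words(reference, word_size)
    mismatches = []
    for index in range(max(len(test_words), len(ref_words))):
        test_word = test_words[index] if index < len(test_words) else None
        ref_word = ref_words[index] if index < len(ref_words) else None
        if test_word != ref_word:
            mismatches.append((index, test_word, ref_word))
    return mismatches


reference = bytes.fromhex("0000000a00000014")   # memory dump of the reference simulator
matching = bytes.fromhex("0000000a00000014")
deviating = bytes.fromhex("0000000a00000015")
```

An empty result means that both simulators agree on the test outcome; any entry points to the first place to look for a bug in the simulator under test.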
In addition to test generation approaches, we developed and executed several complex SW applications on our VP. They are based on the Zephyr and FreeRTOS as well as Linux operating systems. Besides core components such as threads, interrupts, timers, message queues and semaphores, several libraries such as FAT, UDP, SLIP and TinyCrypt are leveraged as well. Moreover, these applications utilize the full platform (including peripherals like a UART or interrupt controller) and not just the CPU core and memory. In this testing process, we observed that the applications behaved as expected.

Formal verification of SystemC designs
Testing is widely adopted because of its ease of use and scalability, but at the same time it inherently suffers from incompleteness. Therefore, to prove correctness, formal verification techniques are indispensable. However, formal verification is very challenging, in particular for SystemC-based designs [24,75]. We identified three main challenges that need to be addressed.
(1) It has to deal with very large state spaces. The reason is that it has to consider all possible inputs and different process execution orders of the SystemC design in order to be complete.
(2) A typical SystemC design contains unbounded process loops that are triggered by events. Thus, a cycle detection mechanism is required to complete the verification process.
(3) Since SystemC is a C++ library, the verification engine has to handle the full complexity of C++ in order to obtain a formal model for verification.
The first two challenges relate to the verifier back-end while the third challenge is a front-end issue. Therefore, to make the challenges more manageable, we introduced an intermediate verification language (IVL) (see [76]) to separate between both issues and focus on the back-end challenges henceforth (the IVL has been further extended in [77] to support additional language features tailored for TLM peripheral models). In short, the IVL is an open and compact language designed as formal intermediate representation for SystemC.

Stateful symbolic simulation
Based on the IVL, we developed a stateful symbolic simulation approach that combines several techniques to enhance the scalability in the back-end and thus address the remaining two challenges [23,78]. The foundation consists of three techniques that are integrated under the simulation semantics of SystemC.
(1) Symbolic execution (SymEx) to reason about a large number of different inputs and exploration paths very efficiently.
(2) POR to prune redundant scheduling sequences of SystemC processes.
(3) State subsumption reduction (SSR) to efficiently detect exploration cycles in symbolic explorations.
While these techniques drastically boost the verification efficiency and provide a strong framework for SystemC verification, complex symbolic reasoning is still subject to scalability issues and therefore further optimizations remain highly important.

Optimization and application
One such optimization is compiled symbolic simulation (CSS) [79]. The idea is to leverage native execution to boost the exploration performance. To achieve this, the engine for symbolic execution is integrated with the process scheduler (that employs a POR-based optimization) into the SystemC design under test. Then, all components are compiled into a single binary, which enables native execution of the scheduled process orders. Performance can be further boosted by leveraging a parallelized exploration algorithm on top of CSS [80]. As a case-study, we reported verification results for a VP-based interrupt controller [77,80]. Please refer to our related work in Subsection 2.3 for an overview on other formal approaches for SystemC verification.
In the next steps, we plan to leverage our formal verification methods for SystemC to verify components of our RISC-V VP and ultimately the whole VP.

Verification of embedded software binaries
Our VP-based verification of embedded SW binaries follows a common conceptual flow built around a test generator and an execution engine (Figure 3). The test generator continuously provides new inputs to the execution engine. At the heart of the execution engine is the VP, which is leveraged for executing the SW binary with each input. A VP-based execution yields very accurate verification results, because complex HW/SW interactions such as interrupts and peripheral interactions are modeled accurately and the SW is processed at the binary level. A tracer and checker component is attached to the VP to collect and process execution information and pass that feedback to the test generator to guide the test generation process. Specified properties are checked alongside the VP-based execution.
Following this conceptual flow, we implemented three different approaches that leverage concolic testing (CT) [14,15], coverage-guided fuzzing (CGF) [16] and dynamic information flow tracking (DIFT) [17]. They trace symbolic constraints, execution coverage and information flow during the VP-based execution, respectively (right side of Figure 3). CT and CGF enable automated test generation specifically tailored to increase coverage for functional verification. CT leverages formal methods based on solving symbolic constraints while CGF employs scalable fuzzing techniques. Properties are specified in the form of SW assertions. DIFT focuses on checking security-related properties, such as integrity and confidentiality, alongside the execution and thus complements test generation techniques. All these techniques have been shown to be very effective in the SW domain. However, using them on embedded SW binaries is challenging, because it requires dealing with architecture-specific details and extensive interaction with HW peripherals. These challenges can be addressed by leveraging VPs. In the following, we briefly summarize the main idea of these three approaches in our VP-based context.

Concolic testing
CT works by tracking symbolic constraints on top of the concrete execution and solving these constraints to generate new inputs that represent new tests. Each test explores a different path through the SW program thus ultimately maximizing the SW path coverage. Symbolic constraints are also leveraged to check assertions and other potential error conditions alongside the execution.
Our VP-based approach [14,15] enables concolic testing for RISC-V binaries that extensively interact with peripherals. On the technical side, the RISC-V ISS and memory are instrumented to track symbolic constraints alongside the native execution, and a specialized interface is provided to integrate SW models of peripherals. The interface is tailored for the requirements of SystemC-based peripherals. Our approach has been very effective in finding buffer overflows in the TCP/IP stack of FreeRTOS and a bug in the RISC-V specific memcpy function of the newlib C library.
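The principle of tracking constraints on top of a concrete execution can be illustrated with a minimal sketch. It is not the instrumented ISS of [14,15]: the program under test is a hypothetical Python function with a single input byte, the names `Concolic`, `Tracer`, `program`, `solve` and `explore` are assumptions made for illustration, and an exhaustive search over the 256 possible inputs stands in for the constraint solver.

```python
from typing import Callable, Dict, List, Optional, Tuple

Predicate = Callable[[int], bool]        # a constraint over the single input byte
Path = List[Tuple[Predicate, bool]]      # (constraint, observed branch outcome)


class Concolic:
    """Concrete value plus the symbolic expression (a function of the input) behind it."""

    def __init__(self, concrete: int, expr: Callable[[int], int]):
        self.concrete = concrete
        self.expr = expr

    def __add__(self, k: int) -> "Concolic":
        return Concolic(self.concrete + k, lambda x, e=self.expr: e(x) + k)

    def __mul__(self, k: int) -> "Concolic":
        return Concolic(self.concrete * k, lambda x, e=self.expr: e(x) * k)


class Tracer:
    """Records one symbolic constraint per executed branch."""

    def __init__(self) -> None:
        self.path: Path = []

    def equals(self, v: Concolic, k: int) -> bool:
        taken = v.concrete == k
        self.path.append((lambda x, e=v.expr: e(x) == k, taken))
        return taken

    def less_than(self, v: Concolic, k: int) -> bool:
        taken = v.concrete < k
        self.path.append((lambda x, e=v.expr: e(x) < k, taken))
        return taken


def program(x: Concolic, t: Tracer) -> str:
    """Hypothetical SW under test; the first branch models a failing assertion."""
    y = x * 2 + 3
    if t.equals(y, 91):
        return "assertion violated"
    if t.less_than(x, 100):
        return "small"
    return "large"


def solve(constraints: Path) -> Optional[int]:
    """Stand-in for an SMT solver: exhaustive search over one input byte."""
    for candidate in range(256):
        if all(pred(candidate) == want for pred, want in constraints):
            return candidate
    return None


def explore(seed: int = 0) -> Dict[int, str]:
    """Run concretely, then negate each recorded branch to derive new inputs."""
    worklist, seen, results = [seed], set(), {}
    while worklist:
        value = worklist.pop()
        tracer = Tracer()
        outcome = program(Concolic(value, lambda x: x), tracer)
        signature = tuple(taken for _, taken in tracer.path)
        if signature in seen:
            continue                      # this path was already explored
        seen.add(signature)
        results[value] = outcome
        for i, (pred, taken) in enumerate(tracer.path):
            new_input = solve(tracer.path[:i] + [(pred, not taken)])
            if new_input is not None:
                worklist.append(new_input)
    return results
```

Starting from the seed input 0, `explore` reaches all three paths of `program`, including the input 44 that triggers the modeled assertion violation.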

Coverage-guided fuzzing
CT is a powerful verification technique but may be susceptible to scalability issues due to path explosion and complex symbolic constraints. Therefore, it is important to provide scalable simulation-based methods to complement CT. CGF is such a technique. Based on the principles of classical fuzzing [81], modern CGF continuously mutates randomly created data to produce new inputs and is guided by code coverage. Notable CGF representatives in the SW domain are the LLVM-based libFuzzer 24) and AFL 25).
We combined SystemC-based VPs with CGF for verification of embedded RISC-V SW binaries [16]. To be more efficient, the fuzzing process is guided by two coverage metrics: (1) coverage of the embedded SW, and (2) coverage of the SystemC-based peripherals of the VP. This bridges the gap between SW and HW in the fuzzing engine. The VP tracks both kinds of coverage information while executing test-cases. Our approach has been very effective in analyzing real-world RISC-V embedded SW binaries (based on bare-metal systems and using Zephyr).
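A minimal sketch of such a fuzzing loop is given below. It is an illustration under stated assumptions, not the engine of [16]: `firmware` and `Peripheral` are hypothetical stand-ins for the embedded SW and a SystemC-based peripheral, coverage points are plain strings prefixed with `sw:` or `hw:`, and the mutation operators are deliberately simple.

```python
import random
from typing import List, Set, Tuple


class Peripheral:
    """Hypothetical peripheral model that reports its own (HW-side) coverage."""

    def __init__(self, coverage: Set[str]) -> None:
        self.coverage = coverage

    def write(self, reg: int, value: int) -> None:
        if reg == 0:
            self.coverage.add("hw:ctrl")
            if value & 1:
                self.coverage.add("hw:enable")
        elif reg == 1:
            self.coverage.add("hw:data")
            if value > 7:
                self.coverage.add("hw:overflow")
        else:
            self.coverage.add("hw:unmapped")


def firmware(data: bytes, coverage: Set[str], hw: Peripheral) -> None:
    """Hypothetical embedded SW: parses a two-byte command and drives the peripheral."""
    coverage.add("sw:entry")
    if len(data) < 2:
        coverage.add("sw:too_short")
        return
    if data[0] == 0x55:
        coverage.add("sw:magic")
        hw.write(data[1] & 0x3, data[1] >> 4)


def run(data: bytes) -> Set[str]:
    """Execute one test-case and return the SW and HW coverage points it hit."""
    coverage: Set[str] = set()
    firmware(data, coverage, Peripheral(coverage))
    return coverage


def mutate(data: bytes, rng: random.Random) -> bytes:
    """Flip a bit, overwrite a byte, or append a byte."""
    buf = bytearray(data)
    choice = rng.randrange(3)
    if choice == 0 and buf:
        buf[rng.randrange(len(buf))] ^= 1 << rng.randrange(8)
    elif choice == 1 and buf:
        buf[rng.randrange(len(buf))] = rng.randrange(256)
    else:
        buf.append(rng.randrange(256))
    return bytes(buf)


def fuzz(iterations: int, seed: int = 1) -> Tuple[Set[str], List[bytes]]:
    """Keep a mutated input only if it reaches a new SW or HW coverage point."""
    rng = random.Random(seed)
    corpus = [b"\x00\x00"]
    total = run(corpus[0])
    for _ in range(iterations):
        child = mutate(rng.choice(corpus), rng)
        coverage = run(child)
        if not coverage <= total:
            total |= coverage
            corpus.append(child)
    return total, corpus
```

Because peripheral coverage feeds into the same feedback set, an input that exercises a new branch inside the peripheral is kept in the corpus even if it adds no new SW coverage.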

Dynamic information flow tracking
Protecting SW against security-related exploits is more important than ever in today's highly interconnected devices. A powerful technique to enable such protection is DIFT [82,83]. DIFT tracks the flow of information alongside the SW execution from the inputs to the outputs of the system [84]. It allows checking that secret data is not leaked (confidentiality) and that untrusted input does not influence sensitive data (integrity).
Our approach combines DIFT with VPs to enable early and accurate information flow analysis of embedded RISC-V binaries [17]. A major benefit of our approach is the virtually non-intrusive integration of the DIFT engine with the VP-based execution. In particular, we leverage C++ templates and operator overloading to achieve such a transparent integration. Our approach has been very effective in revealing security-related exploits based on a car engine immobilizer case-study and in detecting code injections based on a standard benchmark set.
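The idea of propagating security tags through overloaded operators can be sketched as follows. The sketch uses Python operator overloading as a stand-in for the C++ templates of the actual VP; the class `Tainted`, the tag names `secret` and `untrusted`, and the sinks `uart_write` and `store_key` are hypothetical. Only explicit data flows are tracked here; implicit flows through branches are outside the scope of this illustration.

```python
import operator
from typing import Callable, Iterable, Union


class SecurityViolation(Exception):
    """Raised when tagged data reaches a sink that forbids it."""


class Tainted:
    """An integer value that carries a set of security tags."""

    def __init__(self, value: int, tags: Iterable[str] = ()) -> None:
        self.value = value
        self.tags = frozenset(tags)

    @staticmethod
    def lift(x: Union["Tainted", int]) -> "Tainted":
        return x if isinstance(x, Tainted) else Tainted(x)

    def _binary(self, other: Union["Tainted", int],
                op: Callable[[int, int], int]) -> "Tainted":
        other = Tainted.lift(other)
        # The result inherits the tags of both operands.
        return Tainted(op(self.value, other.value), self.tags | other.tags)

    def __add__(self, other):
        return self._binary(other, operator.add)

    def __xor__(self, other):
        return self._binary(other, operator.xor)

    def __and__(self, other):
        return self._binary(other, operator.and_)

    # The three operations are commutative, so the reflected forms can reuse them.
    __radd__ = __add__
    __rxor__ = __xor__
    __rand__ = __and__


def uart_write(data: Union[Tainted, int]) -> int:
    """Output sink: confidentiality check, secret data must not leave the system."""
    data = Tainted.lift(data)
    if "secret" in data.tags:
        raise SecurityViolation("secret data written to UART")
    return data.value & 0xFF


def store_key(data: Union[Tainted, int]) -> int:
    """Sensitive location: integrity check, untrusted data must not influence it."""
    data = Tainted.lift(data)
    if "untrusted" in data.tags:
        raise SecurityViolation("untrusted data stored as key")
    return data.value
```

For example, with `pin = Tainted(0x1234, {"secret"})` and `challenge = Tainted(0x00FF, {"untrusted"})`, the value `(pin ^ challenge) + 1` carries both tags, so passing it to either `uart_write` or `store_key` raises `SecurityViolation`, while untagged values pass unchanged.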

Cross-level compliance testing and verification
Compliance testing and verification are very important problems for RISC-V. Figure 4 shows an overview of our cross-level VP/RTL compliance testing (top) and verification (bottom) approaches. The idea is that the ISS of the VP serves as reference model for the RTL implementation, which is a common setting in a VP-based design flow. We discuss our approaches in Subsections 7.2 and 7.3, respectively. We review related work in Subsection 2.2. In the following, we start with a brief motivation on the role and importance of compliance testing for RISC-V and delimit it from design verification (Subsection 7.1).

Motivation on compliance testing for RISC-V
As stated in the introduction, RISC-V is a highly modular ISA that offers extensive configuration options and can be extended with custom instruction sets to build very application specific processors. However, as a direct implication it becomes very challenging to ensure that the RISC-V ecosystem as a whole is compatible with all the different processors. Too much customization can introduce SW incompatibilities between RISC-V implementations, which in turn cause fragmentation of the ecosystem. This crucial problem has been recognized by the RISC-V foundation and therefore the compliance task group has been formed to develop efficient methods for compliance testing 26). In contrast to design verification, which attempts to find bugs in the processor and ultimately prove correctness of the full functional behavior, compliance testing focuses on checking the parts relevant for the HW/SW interface to ensure compatibility of the processor with the RISC-V SW ecosystem. Thus, compliance testing for example checks if some registers are missing or have incorrect widths; it also checks available modes and access types, and performs basic sanity checks on the instructions (including if they are absent). This is very important to ensure that highly customized RISC-V processors from different vendors can benefit from the extensive RISC-V ecosystem. The official approach for compliance testing that is pursued by the compliance task group is to build a designated compliance test-suite for each RISC-V standard instruction set. While recent reports indicate that a high quality has been reached by the compliance test-suite for the base integer instruction set, the compliance testing problem for RISC-V is still far from being solved. Efficient compliance test generation methods are required to obtain comprehensive results. Finally, compliance testing complements design verification but does not replace it. Thorough verification still has to be performed in the subsequent design flow steps.
Figure 4 (Color online) Cross-level co-simulation for verification purposes.

Compliance testing
Ultimately we follow the official approach on compliance testing but generate alternative test-suites that enable more comprehensive compliance testing. Essentially, two steps are involved: (1) generating a compliance test-suite together with a set of reference signatures (i.e., selected result register and memory values), and (2) using the generated test-suite to check compliance of a RISC-V simulator or RTL core under test as shown in Figure 4 (top). Following this methodology, we developed three complementary approaches for test-case generation that target compliance testing from the positive and negative testing perspective:
(1) Coverage requirements for the final test-suite are specified in combination with instruction constraints in a domain specific language. All requirements are solved in combination with the constraints using an SMT solver to generate a compliance test-suite [21]. This offers a strong foundation for positive testing.
(2) The second approach complements the first by utilizing mutation-based testing. In a first step, a set of mutation classes, tailored for the RISC-V ISA, is defined. In the second step, symbolic execution techniques are leveraged to generate a test-suite that kills all mutations in the ISS (that implements the RISC-V ISA) [85].
(3) The third approach focuses on negative testing to further complement the positive testing approaches. Negative testing attempts to reveal bugs by focusing on illegal instructions and other inputs that can cause exceptions. The rationale is to ensure that no additional behavior is accidentally added in a RISC-V implementation. We leverage fuzzing-based techniques that are guided by different coverage metrics [20].
In combination, our approaches offer a strong compliance testing framework and significantly enhance the existing official test-suites. Based on our approaches, we found new bugs in several RISC-V simulators.
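The two-step signature flow described above can be sketched as follows. This is a simplified illustration, not the official compliance framework or our generators: the instruction subset (`addi`, `add`), the tuple encoding, the hand-written test-suite and the injected defect (a writable `x0`) are all assumptions chosen to keep the example small.

```python
from typing import Dict, List, Sequence, Tuple

MASK = 0xFFFFFFFF                       # RV32: registers are 32 bits wide
Instr = Tuple[str, int, int, int]       # (opcode, rd, rs1, rs2 or immediate)


def execute(program: Sequence[Instr], x0_hardwired: bool = True) -> List[int]:
    """Tiny ISS for a two-instruction subset; x0_hardwired=False models a defect."""
    regs = [0] * 32
    for op, rd, rs1, arg in program:
        if op == "addi":
            result = regs[rs1] + arg
        elif op == "add":
            result = regs[rs1] + regs[arg]
        else:
            raise ValueError(f"unsupported instruction: {op}")
        if rd != 0 or not x0_hardwired:
            regs[rd] = result & MASK
    return regs


def signature(regs: List[int], selected: Sequence[int]) -> Tuple[int, ...]:
    """A signature is the tuple of selected result registers."""
    return tuple(regs[i] for i in selected)


# Hypothetical test-suite: name -> (program, registers that form the signature)
TEST_SUITE: Dict[str, Tuple[List[Instr], Tuple[int, ...]]] = {
    "add_basic": ([("addi", 1, 0, 5), ("addi", 2, 0, 7), ("add", 3, 1, 2)], (1, 2, 3)),
    "x0_write": ([("addi", 0, 0, 9), ("add", 1, 0, 0)], (1,)),
    "wrap": ([("addi", 1, 0, -1)], (1,)),
}


def reference_signatures() -> Dict[str, Tuple[int, ...]]:
    """Step 1: run every test on the reference model and store its signature."""
    return {name: signature(execute(prog), sel)
            for name, (prog, sel) in TEST_SUITE.items()}


def check_compliance(x0_hardwired: bool) -> List[str]:
    """Step 2: run the model under test and report tests whose signature differs."""
    expected = reference_signatures()
    return [name for name, (prog, sel) in TEST_SUITE.items()
            if signature(execute(prog, x0_hardwired), sel) != expected[name]]
```

In this sketch only the `x0_write` test distinguishes the defective model from the reference, which illustrates why the choice of tests and signature registers determines what a compliance suite can detect.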

Cross-level verification
Thorough verification is crucial to detect bugs and prove their absence. Formal verification techniques at processor level still suffer from a high complexity and potentially limited scalability. For this reason simulation-based methods that leverage randomized testing techniques still form the backbone of the verification effort.
In this context, we proposed an efficient approach [19] for RISC-V processor verification in an RTL/ISS cross-level setting. At the heart of our approach is an instruction stream generator (ISG) that generates an endless and dynamically evolving stream of instructions. This instruction stream is fed in a cross-level co-simulation setup to the reference ISS and RTL core under test, as shown in Figure 4 (bottom). After each instruction the resulting architectural state is compared between both models. Our approach offers three major benefits compared to traditional test generation approaches.
(1) No control flow related restrictions are placed on the instruction stream. In particular, infinite loops (or trap loops) are no problem, because the instruction stream evolves dynamically at runtime (hence a different instruction can be returned for the same program counter).
(2) Restarts that reset the execution state before executing a new test are not necessary. By using a single endless instruction stream, very long test sequences, which can be very important to find intricate bugs, are possible.
(3) A very high performance is achieved by placing the ISS and RTL core in a tight co-simulation within a single process. Furthermore, instructions are directly injected into the processor fetch interface, making the processing step very efficient. We observed more than 200 million processed instructions per hour on a standard laptop.
Our approach has been very effective in finding several serious bugs in an industrial pipelined 32-bit RISC-V core. Most bugs were related to the RISC-V privileged architecture, particularly CSRs. Testing CSRs is a challenging task which has been mostly neglected by existing RISC-V verification frameworks. In Subsection 2.2, we discuss related work on RISC-V verification in more detail.
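The co-simulation loop can be sketched in a few lines. The sketch is a strong simplification of [19] under stated assumptions: both models are Python functions instead of an ISS and an RTL core, the architectural state consists of the program counter, the registers and a single CSR (`mscratch`), `csrrw` is reduced to that one CSR, and the defect in the model under test is hypothetical.

```python
import itertools
import random
from typing import Dict, Iterable, Iterator, Optional, Tuple

MASK = 0xFFFFFFFF
Instr = Tuple[str, int, int, int]       # (opcode, rd, rs1, rs2 / immediate / offset)


def new_state() -> Dict[str, object]:
    return {"pc": 0, "regs": [0] * 32, "mscratch": 0}


def step(state: Dict[str, object], instr: Instr, csr_bug: bool = False) -> None:
    """Execute one injected instruction; csr_bug models a defect in the core under test."""
    op, rd, rs1, arg = instr
    regs = state["regs"]
    next_pc = state["pc"] + 4
    if op == "addi":
        result = regs[rs1] + arg
    elif op == "add":
        result = regs[rs1] + regs[arg]
    elif op == "csrrw":
        result = state["mscratch"]           # old CSR value goes to rd
        if not (csr_bug and rd == 0):        # hypothetical bug: write dropped if rd is x0
            state["mscratch"] = regs[rs1]
    elif op == "jal":
        result = next_pc
        next_pc = state["pc"] + arg          # an offset of 0 forms an infinite loop
    else:
        raise ValueError(f"unsupported instruction: {op}")
    if rd != 0:
        regs[rd] = result & MASK
    state["pc"] = next_pc & MASK


def instruction_stream(seed: int) -> Iterator[Instr]:
    """Endless generator; the next instruction does not depend on the program counter."""
    rng = random.Random(seed)
    while True:
        op = rng.choice(("addi", "add", "csrrw", "jal"))
        rd, rs1 = rng.randrange(32), rng.randrange(32)
        if op == "addi":
            arg = rng.randrange(-2048, 2048)
        elif op == "add":
            arg = rng.randrange(32)
        elif op == "jal":
            arg = 4 * rng.randrange(-4, 5)
        else:
            arg = 0
        yield (op, rd, rs1, arg)


def cosimulate(stream: Iterable[Instr], max_steps: int,
               dut_bug: bool) -> Optional[Tuple[int, Instr]]:
    """Feed the same stream to both models and compare the state after each instruction."""
    ref, dut = new_state(), new_state()
    for n, instr in enumerate(itertools.islice(stream, max_steps)):
        step(ref, instr)
        step(dut, instr, csr_bug=dut_bug)
        if ref != dut:
            return n, instr                  # first diverging instruction
    return None
```

Because instructions are injected rather than fetched from memory, a `jal` with offset 0 does not stall the test, and the comparison after every instruction pinpoints the first diverging instruction without any restart.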

Application in the IoT/CPS domain
Modern IoT and CPS devices have certain properties that make their design flow very challenging: they integrate complex hardware and software components; they are often resource-constrained devices that need to operate efficiently under tight energy consumption requirements; they operate in and extensively interact with the physical environment; they provide intelligent functions and a high degree of connectivity; they have stringent requirements on robustness as well as safety and security.
Therefore, it is very important to provide methods for design space exploration, parallelize the HW and SW development and start early with integration and verification efforts. For this reason, VPs play an essential role in the design flow for such embedded systems by providing a platform for early SW development and serving as an executable reference model for the subsequent design flow steps.
A VP provides a single unified view which covers the SW, HW and environment in a single execution environment. By being an SW model written in the C++ based SystemC language, a VP provides very extensive introspection capabilities that cover the SW, HW and environment levels. A very important feature in particular for CPS is the ability to configure and access the execution state on a very fine granular basis during the execution. This strongly facilitates testing and debugging of CPS. On the one hand the execution can be stopped, including the physics simulation of the environment (which is not possible in a physical setting); on the other hand very specific configurations can be set up and tested much more easily and quickly (which can be very cumbersome and expensive to set up in a physical setting). Moreover, to cover robustness aspects, which are crucial for IoT and CPS, error effect simulations can be conducted at the VP level in a straightforward way. Essentially, an error effect simulation works by injecting errors such as single bit flips in registers and observing their effects on the simulation outcome. Such an error effect simulation is very fast, as it is performed on a C++ model, but also very accurate, as the VP provides the complete HW interface as far as it is relevant to the SW execution and environment interaction. By using snapshots, testing and debugging can be further improved on the VP level. Moreover, components at the VP level can be replaced by refined RTL implementations or even connected with physical devices to enable hybrid simulations, which can be very important for CPS/IoT devices that strongly interact with the physical environment.
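The principle of an error effect simulation, injecting single bit flips in registers and comparing the outcome with a fault-free run, can be sketched as follows. The four-instruction workload, the tuple encoding and the two outcome classes are assumptions made for illustration; a VP-based campaign would inject into the full simulation model instead.

```python
from collections import Counter
from typing import Optional, Sequence, Tuple

MASK = 0xFFFFFFFF
Instr = Tuple[str, int, int, int]   # (opcode, rd, rs1, rs2 or immediate)
Fault = Tuple[int, int, int]        # (time step, register index, bit position)

# Hypothetical workload: x3 = 5 + 3, then x2 is cleared; the output is x3.
WORKLOAD: Sequence[Instr] = (
    ("addi", 1, 0, 5),
    ("addi", 2, 0, 3),
    ("add", 3, 1, 2),
    ("addi", 2, 0, 0),
)


def run(program: Sequence[Instr], fault: Optional[Fault] = None) -> int:
    """Execute the workload; an optional fault flips one register bit before a step."""
    regs = [0] * 4
    for t, (op, rd, rs1, arg) in enumerate(program):
        if fault is not None and fault[0] == t:
            regs[fault[1]] ^= 1 << fault[2]
        if op == "addi":
            result = regs[rs1] + arg
        elif op == "add":
            result = regs[rs1] + regs[arg]
        else:
            raise ValueError(f"unsupported instruction: {op}")
        if rd != 0:
            regs[rd] = result & MASK
    return regs[3]


def classify(fault: Fault) -> str:
    """Compare the faulty outcome against the fault-free (golden) run."""
    return "masked" if run(WORKLOAD, fault) == run(WORKLOAD) else "corrupted"


def campaign() -> Counter:
    """Exhaustively inject every single bit flip into x1..x3 at every time step."""
    return Counter(classify((t, reg, bit))
                   for t in range(len(WORKLOAD))
                   for reg in (1, 2, 3)
                   for bit in range(32))
```

A flip in `x1` just before the `add` corrupts the result, whereas a flip in `x2` after the `add` is masked because `x2` is overwritten and no longer contributes to the output.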
Based on our open source RISC-V VP, such a development flow for CPS/IoT devices is enabled for the modern RISC-V architecture. Moreover, our VP can also serve as a platform for further research and education in these areas. As mentioned in Section 4, our RISC-V VP in particular also covers the SW, HW, and environment levels. On the SW side, we support complex stacks with libraries and operating systems such as FreeRTOS, RIOT, Zephyr and Linux. On the HW side, the VP is a configurable and extensible platform that scales from single- to multi-core devices. We provide an example configuration for the HiFive1 board from SiFive, including a virtual environment for an example scenario that allows providing inputs and visualizing outputs. An Eclipse-based debugging environment enables fine granular introspection of the execution state and control of the simulation.

Challenges and future work
Our RISC-V based VP is implemented in SystemC TLM-2.0 and already provides a significant set of features, which makes our VP a strong foundation for several application areas. Among them are early SW development and the ability to analyze complex HW/SW interactions for RISC-V based systems. In addition, our verification approaches tailored for RISC-V to verify the VP, the embedded SW and the RTL in a VP cross-level setting, show very promising results and integrate seamlessly with the VP-based design flow. Nonetheless, our VP and verification approaches can still be improved and extended. In the following, we discuss our plans for future work and sketch the main challenges to further enhance virtual prototyping solutions for RISC-V.
• Verification. Formal verification methods are crucial to obtain correctness proofs but they are still highly susceptible to scalability problems. Thus, boosting their scalability is a very important research direction. To this end, domain specific optimizations and efficient combinations of different verification techniques, such as symbolic execution with state space abstraction and reduction techniques, are investigated (Subsection 2.3). The major goal is to avoid state space explosion as much as possible and thus pave the way for applying formal verification methods to the whole VP. We believe that our proposed stateful symbolic simulation for SystemC (Subsection 5.2) provides a very solid foundation to pursue this goal. Another way to improve scalability is to integrate existing simulation-based techniques with formal methods (e.g., coverage-guided fuzzing with symbolic simulation) to create a single unified verification approach that combines the benefits of both worlds. Such a unified combination can cover the state space very efficiently, both deeply and broadly. The idea is to switch seamlessly and intelligently at runtime between formal and simulation-based verification techniques on demand to achieve the best possible utilization of available resources. Advanced exploration strategies can be devised to speed up bug hunting further and increase coverage more efficiently. An important point in this direction is also the development of stronger coverage metrics, e.g., [86], that combine code coverage with functional coverage and take complex features such as interrupts or threads into account. The challenges in formal SW verification are conceptually similar. Therefore, these solutions apply as well in the SW context to improve handling of large state spaces. Existing SW verification methods mainly leverage symbolic execution techniques (Subsection 2.4).
• Fast and accurate simulation. VPs should provide a high simulation performance (to deal with complex SW) and at the same time yield accurate timing results (to do performance evaluations and optimizations), which are two conflicting requirements in general. Different research directions are available to tackle this problem. Source level timing simulations (SLTS) attempt to instrument precise timing information into the SW, which is then compiled and natively executed on the host system (e.g., [87,88]). An alternative is to leverage DBT-based techniques to speed up the VP-based simulation via native execution (e.g., [89,90]). However, sophisticated analysis techniques are required to instrument precise timing information, and accurate modeling of complex HW/SW interactions such as interrupts becomes very challenging. Another research direction is to leverage runtime adaptive simulations that can dynamically (and ultimately intelligently) switch between fast and accurate execution modes at runtime, e.g., [91,92]. A use-case would be to perform a fast boot of an operating system and then a precise analysis of an SW driver. Besides performance evaluations, these methods are also applicable to other non-functional properties such as power consumption.
• Cross-level methodology. The VP serves as a reference platform for subsequent development steps in the design flow and in addition is an executable high-level model that offers fast simulation performance combined with strong debugging and configuration capabilities. An important research direction is to investigate a cross-level methodology that brings together the VP level and RTL. This allows re-using VP-based information at RTL and enhancing VP-based methods with RTL information. For example, in [93] an RTL to TLM correspondence analysis is presented that can be used to speed up a VP-based error effect simulation. Another approach switches between the RTL and VP simulation models at runtime to achieve the same goal [94]. Starting from the VP, the approach in [95] derives an RTL property set based on TLM properties as starting point for RTL model checking. It is a step towards building a fully automated and scalable cross-level methodology that makes VP verification results available at RTL. Our presented cross-level compliance testing and verification approaches (Section 7) are a step in this direction as well. They can be further improved by leveraging stronger coverage metrics for the test-suite generation and by applying formal methods for the VP/RTL equivalence check. Finally, high-level synthesis (HLS) techniques are an important building block to streamline the design flow for embedded systems. However, their applicability is still limited.
• Security. Security is a crucial aspect that will become even more important in the highly connected next generation embedded systems. Therefore, the VP-based design flow should be augmented to consider security-related aspects and thus enable early and accurate evaluation of security policies that reason about data integrity and confidentiality. Our VP-based DIFT is a very promising combination to achieve this goal. However, besides DIFT-based monitoring of security policies, it is necessary to devise efficient and scalable verification techniques and coverage metrics that are tailored for security policies and complement existing functional verification. Another interesting direction to support DIFT would be to automatically learn the data flow relations within a SystemC-based peripheral, e.g., based on neural networks [96], to avoid a manual DIFT integration into every peripheral.
• Mixed-signal. Besides security, another important aspect that should be considered early in the VP-based design flow is mixed-signal support, in particular from the verification perspective (e.g., [97,98]). The goal here would be to provide a unified and comprehensive verification environment that can reason about digital and analog components in combination. This complementary research direction would further broaden the verification scope to support highly heterogeneous systems.
• RISC-V. The RISC-V ISA adds a set of additional challenges that are inherently rooted in the design decisions made for RISC-V. Being an open and royalty-free ISA that has been specifically designed to be highly configurable and extensible, RISC-V is particularly suited to build highly efficient application specific solutions that include only the necessary features combined with custom domain specific extensions. This unrestricted design freedom has implications for the RISC-V ecosystem and adds unique challenges to verification solutions (for example, compliance testing is extremely important and challenging). Considering the VP-based design flow, the integration of custom instruction extensions in particular will be very important and should be supported at all levels, ranging from specification to generation of simulation models (functional and non-functional) as well as verification.
Finally, in the best case, all these aspects should ultimately be integrated into a single unified VP-based framework that cross-integrates and connects all approaches. For example, the integration of AMS models should work together with the DIFT-based security approaches, be compatible with HLS, and be supported by the formal verification techniques. This amplifies the existing challenges even further. A compositional approach that efficiently combines all separate building blocks might be a viable solution to tackle this problem.

Conclusion
In this paper, we presented advanced virtual prototyping techniques tailored for RISC-V in a unified overview. The foundation is our open-source RISC-V VP that is implemented in SystemC TLM and designed as a configurable and extensible platform. It scales from small bare-metal systems to large multi-core systems that run applications on top of the Linux operating system. Based on our VP, we discussed and reviewed advanced verification techniques that are designed to verify the VP itself, the embedded RISC-V SW and the RTL implementation in a VP-based cross-level setting. Finally, we sketched promising directions for future work and open challenges in the context of virtual prototyping for RISC-V.