Pushing the Limits of Parallel Discrete Event Simulation for SystemC

Dömer, Rainer; Cheng, Zhongqi; Mendoza, Daniel; Arasteh, Emad

doi:10.1007/978-3-030-47487-4_7

Abstract

The IEEE SystemC language is widely used in industry and academia to model and simulate system-level designs. Despite the availability of multi- and many-core host processors, however, the Accellera reference simulator is still based on sequential discrete event simulation, utilizing only a single core at any time. While many advanced parallel simulation approaches have been proposed, most require modification of the SystemC source code so that the model is free from parallel access conflicts and rely on the designer to manually perform this difficult transformation.

The Recoding Infrastructure for SystemC (RISC) project addresses the parallel SystemC simulation problem with automatic compiler-based analysis and source code transformation. A dedicated SystemC compiler and corresponding parallel simulator provide safe static analysis and recoding, and thus automatically achieve fast parallel simulation of SystemC models.

We share the RISC framework as open source in order to enable easy evaluation, foster collaboration, and further extend our proof-of-concept implementation.

7.1 Introduction

The IEEE standard SystemC language [13] is widely used for the specification, modeling, validation, and evaluation of electronic system level (ESL) models. The Accellera Systems Initiative maintains not only the official SystemC language definition, but also provides an open source proof-of-concept library that can be used to simulate SystemC design models [1]. However, implementing the classic scheme of discrete event simulation (DES), this reference simulator runs sequentially and cannot utilize the parallel computing resources available on multi- and many-core processor hosts. This severely limits the execution speed of SystemC simulation.

In order to provide faster execution, parallel discrete event simulation (PDES) [8, 12] techniques can be applied. While significant obstacles exist specifically for the SystemC language [7], many parallel simulation approaches have been proposed [5, 11, 19, 21–24]. Beyond these synchronous PDES techniques, out-of-order PDES [6] is even more aggressive. By localizing the simulation time to individual threads and carefully handling events at different times, the simulator engine can issue threads in parallel and ahead of time, following a partial ordering without loss of accuracy. This results in better exploitation of the available parallelism and thus higher simulation speed.

The Recoding Infrastructure for SystemC (RISC) project described in this paper implements out-of-order PDES for the IEEE SystemC language as open source. Specifically, RISC provides a dedicated SystemC compiler and corresponding out-of-order parallel simulator [2, 8, 16]. In contrast to the other approaches, RISC automatically analyzes the SystemC source code, identifies all potential race conditions, and then instruments the model to prevent any conflicts. This transformation does not require any manual recoding or application-specific knowledge.

We share our RISC proof-of-concept implementation with the EDA community as an open source software project in order to facilitate evaluation, promote parallel SystemC simulation, and achieve fruitful collaboration [3, 4].

7.2 RISC Framework

While the RISC software framework may be used for many other analysis and transformation tasks on SystemC models, parallel simulation is the main purpose. To perform semantics-compliant parallel simulation with out-of-order scheduling, we introduce a dedicated SystemC compiler that works hand in hand with a new simulator. This is in contrast to the traditional SystemC simulation flow where a SystemC-agnostic C++ compiler includes the SystemC headers and links the design model directly against the Accellera reference library.

As shown in Fig. 7.1, the RISC compiler acts as a frontend that processes the input model and generates an intermediate model with special instrumentation for conflict-free parallel execution. The instrumented model is then linked against the extended RISC SystemC library by the target compiler (a regular C++ compiler, such as GNU gcc or Intel icpc) in order to produce the output executable model. Out-of-order parallel simulation is then performed simply by running the generated executable model.

From the user perspective, we simply replace the regular C++ compiler with the SystemC-aware RISC compiler (which in turn calls the underlying C++ compiler). Otherwise, the overall SystemC validation flow remains the same as the traditional tool flow. Simulation is just faster due to the parallel execution. Note also that this process is fully automated. No user interaction or manual code transformation is necessary.

7.2.1 RISC Compiler

In order to produce a safe parallel model, the RISC compiler performs three major tasks, namely segment graph construction, conflict analysis, and finally source code instrumentation.

7.2.1.1 Segment Graph Construction

A segment graph (SG) [6] is a directed graph that represents the source code segments executed during the simulation between scheduling steps. More specifically, every segment is associated with a corresponding scheduler entry point, namely a wait statement in SystemC. All other statements in the SystemC source code become part of the segments in which they execute after the corresponding wait statement resumes.
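
As a rough illustration of the segment idea only, the following Python sketch splits a straight-line thread body into segments at each wait statement. It is not the RISC algorithm, which handles arbitrary control flow and function calls as described in [6]. The statement strings and the function name `build_segments` are hypothetical.

```python
def build_segments(statements):
    """Split a straight-line thread body into segments.

    Segment 0 holds the code that runs from thread start up to the first
    wait. Every wait opens a new segment that collects the statements
    executed after that wait resumes. Returns (segments, edges).
    """
    segments = [{"entry": "start", "body": []}]
    for stmt in statements:
        if stmt.startswith("wait"):
            # A wait is a scheduler entry point, so it starts a new segment.
            segments.append({"entry": stmt, "body": []})
        else:
            segments[-1]["body"].append(stmt)
    # In straight-line code, each segment simply leads to the next one.
    edges = [(i, i + 1) for i in range(len(segments) - 1)]
    return segments, edges


# Hypothetical thread body, written as plain strings for illustration.
thread_body = ["x = 0", "wait(e1)", "x = x + 1", "y = x", "wait(10, SC_NS)", "z = y"]
segments, edges = build_segments(thread_body)
```

With branches or loops around wait statements, a segment can have several successors, which is why the real data structure is a graph rather than a list.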

The segment graph construction is a fully automatic but complex process which we will not describe here (see [6] for detailed coverage). However, the RISC compiler must first parse the SystemC input model into an Abstract Syntax Tree (AST). Since SystemC code is syntactically regular C++ code, RISC relies here on the ROSE compiler infrastructure [18]. The ROSE internal representation (IR) provides RISC with a powerful C/C++ compiler foundation that supports AST generation, traversal, analysis, and transformation.

As illustrated with the RISC software stack shown in Fig. 7.2, the RISC compiler then builds a SystemC IR on top of the ROSE IR which accurately reflects the SystemC structures, including the module and channel hierarchy, port connectivity, and other SystemC-specific constructs. On top of the SystemC IR, the compiler architecture then builds the Segment Graph generator and data structures, as well as all other RISC analysis and transformation functions.

7.2.1.2 Conflict Analysis

The segment graph data structure serves as the foundation for segment conflict analysis. At run time, the scheduler in the simulator must ensure that every parallel thread to be issued has no conflicts with any other threads currently in the READY and RUN queues. For this we use the RISC compiler to detect any possible conflicts between these threads already at compile time.

Potential conflicts in SystemC include data hazards, event hazards, and timing hazards, all of which may exist among the segments executed by the threads considered for parallel execution. Again, we refer to [6] for a detailed discussion of these hazards and their static or dynamic detection in RISC. However, we note that if these hazards were ignored, race conditions would occur at run time and jeopardize the correctness of the SystemC simulation.
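
To make the notion of a data hazard concrete, here is a minimal Python sketch that derives a conflict table from per-segment read and write sets. It covers data hazards only and assumes the variable sets are already known. The dictionary layout and the names `data_conflict` and `build_conflict_table` are illustrative assumptions, not the RISC data structures.

```python
def data_conflict(a, b):
    """Return True if segments a and b may race on a shared variable.

    Two segments conflict when one of them writes a variable that the
    other one reads or writes.
    """
    return bool(a["writes"] & (b["reads"] | b["writes"])
                or b["writes"] & (a["reads"] | a["writes"]))


def build_conflict_table(segments):
    """Build a square boolean table indexed by segment ID."""
    n = len(segments)
    return [[data_conflict(segments[i], segments[j]) for j in range(n)]
            for i in range(n)]


# Hypothetical access sets for three segments.
segments = [
    {"reads": {"a"}, "writes": {"x"}},  # segment 0
    {"reads": {"x"}, "writes": {"y"}},  # segment 1 reads what segment 0 writes
    {"reads": {"b"}, "writes": {"z"}},  # segment 2 touches unrelated variables
]
conflict_table = build_conflict_table(segments)
```

In this example segments 0 and 1 conflict, while segment 2 is independent of both. A conservative analysis errs on the side of reporting a conflict whenever the access sets cannot be determined precisely.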

7.2.1.3 Source Code Instrumentation

As a result of the conflict analysis, the RISC compiler generates a set of conflict and timing tables that describe all possible hazards between any two threads. Using this conservative conflict information, the simulator can then at run time quickly determine by a simple table look-up whether or not it is safe to issue a given thread in parallel or ahead of time.
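
The run-time check described above can be sketched as a plain table look-up. The table contents and the function name `safe_to_issue` below are hypothetical, and the sketch ignores the separate timing tables.

```python
# Hypothetical conflict table for three segments, as a compiler might emit it.
# conflict_table[i][j] is True if segment i may conflict with segment j.
conflict_table = [
    [True,  True,  False],
    [True,  True,  False],
    [False, False, True],
]


def safe_to_issue(candidate_seg, active_segs, table):
    """Return True if the candidate segment conflicts with no active segment."""
    return not any(table[candidate_seg][s] for s in active_segs)


# A thread about to enter segment 2 can join threads running segments 0 and 1.
can_run_2 = safe_to_issue(2, [0, 1], conflict_table)
# A thread about to enter segment 1 must not run alongside segment 0.
can_run_1 = safe_to_issue(1, [0], conflict_table)
```

Because the table is computed once at compile time, each scheduling decision costs only a few indexed reads.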

As shown above in Fig. 7.1, the RISC compiler and simulator work closely together. The compiler performs conservative conflict analysis and passes the analysis results to the simulator which then can make safe scheduling decisions quickly.

To pass information from the compiler to the simulator, we use automatic source code instrumentation. That is, the intermediate model generated by the compiler contains instrumented (automatically generated) code which the simulator can then safely rely on.

At the same time, the RISC compiler also instruments the SystemC wait statements with the corresponding segment IDs and furnishes user-defined channels with automatic protection against race conditions among communicating threads.

7.2.2 RISC Simulator

The RISC simulator supports out-of-order parallel discrete event simulation (OoO PDES) [6] for fast SystemC simulation. In OoO PDES, we break the strict order of time (the synchronous barrier) by localizing time stamps to each thread. Since each thread has its own time stamp, the OoO PDES scheduler relaxes the event and simulation time updates, allowing more threads (at different simulation cycles) to run in parallel and ahead of time. This results in a higher degree of parallelism and thus higher simulation speed. However, a thread may only be issued out of order if it has no conflicts with the other threads. We use static compile-time analysis to identify all such potential conflicts. Based on this information (a simple table look-up is sufficient), the OoO PDES scheduler can then quickly decide at run time whether or not a set of threads has any conflicts with each other.
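
The difference between synchronous and out-of-order issue can be sketched in a few lines of Python. This is a strongly simplified model under stated assumptions: each thread carries a local time stamp and the ID of the segment it will execute next, and only a single conflict table is consulted. The real scheduler in [6] also handles event and timing hazards. All names are hypothetical.

```python
from collections import namedtuple

# A ready thread with its own local time stamp and next segment ID.
Thread = namedtuple("Thread", "name time seg")


def issue_synchronous(ready):
    """Synchronous PDES: only threads at the earliest time may run together."""
    t_min = min(th.time for th in ready)
    return [th.name for th in ready if th.time == t_min]


def issue_out_of_order(ready, table):
    """Simplified OoO issue: a thread may run ahead of time if its segment
    has no conflict with any thread that precedes it in time order."""
    ordered = sorted(ready, key=lambda th: th.time)
    issued = []
    for i, th in enumerate(ordered):
        if not any(table[th.seg][other.seg] for other in ordered[:i]):
            issued.append(th.name)
    return issued


# Segments 0 and 1 conflict with each other; segment 2 is independent.
conflict_table = [
    [True,  True,  False],
    [True,  True,  False],
    [False, False, True],
]
ready = [Thread("A", 0, 0), Thread("B", 10, 1), Thread("C", 20, 2)]

sync_issued = issue_synchronous(ready)                 # ["A"]
ooo_issued = issue_out_of_order(ready, conflict_table)  # ["A", "C"]
```

In this toy example the synchronous scheduler runs only thread A, while the out-of-order scheduler additionally issues thread C ahead of time because its segment conflicts with neither A nor B.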

7.2.3 RISC Analysis and Transformation Tools

As an example of other SystemC analysis tools built on top of RISC, the visual tool [17] enables the user to visualize the SystemC module hierarchy. It provides a graphical user interface implemented with the Gtk API and renders the module hierarchy of a specified SystemC source file using the Cairo API. The tool obtains module data from the SystemC IR in the RISC software stack, which contains information about nested modules. It recursively iterates through the nested lists of child modules to collect the information needed to visualize the hierarchy of the entire SystemC source file. The input SystemC source file may contain thousands of lines of code, which makes manually drawing the modules, ports, and channels described by the code a difficult and time-consuming task. The visual tool addresses this issue by automatically generating a visual representation of a SystemC model in a very short period of time. Figure 7.3 shows the module visualization of a Canny edge detector application.
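
The recursive walk over nested child modules can be sketched as follows. The dictionary-based module records, the module names, and the function `render_hierarchy` are hypothetical stand-ins for the SystemC IR, and the sketch produces indented text rather than Gtk/Cairo graphics.

```python
def render_hierarchy(module, depth=0):
    """Return one indented text line per module, visiting children recursively."""
    lines = ["  " * depth + module["name"]]
    for child in module.get("children", []):
        lines.extend(render_hierarchy(child, depth + 1))
    return lines


# Hypothetical module hierarchy of a small test bench.
top = {
    "name": "Top",
    "children": [
        {"name": "Stimulus"},
        {"name": "Platform", "children": [{"name": "DUT"}]},
        {"name": "Monitor"},
    ],
}

hierarchy_lines = render_hierarchy(top)
```

A graphical tool would perform the same traversal but compute box positions and draw each module instead of emitting text.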

7.3 Experiments

We will now evaluate the performance of the RISC simulator. The following experiments show the speedup on an Intel Xeon Phi™ Coprocessor 5110P many-core architecture. The coprocessor contains 60 cores, each with a 512-bit vectorization unit. To obtain unambiguous measurements, we turn CPU frequency scaling off for all experiments.

7.3.1 Mandelbrot Renderer

The Mandelbrot renderer is a parallel video application to compute the Mandelbrot set. Basically, the device under test (DUT) hosts a number of renderer units. Each unit computes a different slice of the Mandelbrot image. At compile time, the user defines how many slices are available.
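
The slicing scheme can be illustrated with a small Python sketch in which each slice covers a contiguous band of image rows and can be computed independently of the others. The coordinate window, iteration limit, and all function names are assumptions chosen for illustration and are not taken from the benchmark model.

```python
def mandelbrot_iterations(c, max_iter=50):
    """Count iterations of z = z*z + c until |z| exceeds 2 (or max_iter)."""
    z = 0j
    for n in range(max_iter):
        if abs(z) > 2.0:
            return n
        z = z * z + c
    return max_iter


def slice_rows(height, num_slices):
    """Split rows [0, height) into contiguous bands, one per renderer unit."""
    base, extra = divmod(height, num_slices)
    bounds, start = [], 0
    for i in range(num_slices):
        end = start + base + (1 if i < extra else 0)
        bounds.append((start, end))
        start = end
    return bounds


def render_slice(rows, width, height, max_iter=50):
    """Compute the rows of one slice; no data is shared with other slices."""
    first, last = rows
    return [
        [mandelbrot_iterations(
            complex(-2.0 + 3.0 * x / width, -1.0 + 2.0 * y / height), max_iter)
         for x in range(width)]
        for y in range(first, last)
    ]


# Four renderer units share a tiny 8 x 10 image.
bands = slice_rows(10, 4)
image = [row for band in bands for row in render_slice(band, 8, 10)]
```

Since the slices share no data, the units only need to communicate when handing over their finished rows.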

Figure 7.4 shows the simulation results [20]. Due to the minimal communication needs in this application, the highest speedups are reached. The 512-bit vectorization unit can execute up to eight double-precision floating-point operations in parallel, and a data-level speedup M of 6.9x is achieved. Thread-level parallelization scales strongly on the 60 cores, with a speedup N of 50x. Beyond that point, the gains flatten because only 60 physical cores are available and additional threads rely on hyper-threading. Notably, the combination of thread- and data-level parallelization (N × M) yields a speedup of up to 212x.
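
For orientation, the reported numbers can be related to each other with simple arithmetic. The sketch below only uses the figures quoted above. The efficiency ratios are a back-of-the-envelope comparison, since the individual peak values were not necessarily measured in the same configuration as the combined result.

```python
M = 6.9           # data-level (vectorization) speedup reported above
N = 50.0          # thread-level speedup reported above
observed = 212.0  # best combined speedup reported above

vector_lanes = 8      # double-precision lanes of the 512-bit unit
physical_cores = 60   # cores of the coprocessor

ideal_combined = N * M                         # about 345x if both peaks multiplied
fraction_of_ideal = observed / ideal_combined  # roughly 0.61
vector_efficiency = M / vector_lanes           # roughly 0.86
thread_efficiency = N / physical_cores         # roughly 0.83
```

The product of the two individual peaks is about 345x, so the observed 212x corresponds to roughly 61% of that idealized bound.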

7.4 RISC Open Source Project

We make the Recoding Infrastructure for SystemC (RISC) described in this article freely available online as a software artifact [9]. Generally, an artifact is a software program together with an applicable data set and test suite that accompanies a research publication for the purpose of independent evaluation (see Note 1). The point here is that the proposed algorithms and data structures are made available as a proof-of-concept implementation and can be used and evaluated by others. Experimental results may be replicated and validated. The proposed approach can also be compared against related work and, since the source code is available, even be extended. Without such artifacts, repeatability poses great challenges [15].

Specifically, the presented RISC compiler and simulator are available as open source on the web [2] and can be used without restrictions (BSD license terms). RISC can be downloaded in both source code and binary format.

7.4.1 Open Source Code and Documentation

In its current version [4], the RISC open source package consists of approximately 162,000 lines of code and includes the C++ source code for the RISC compiler and simulator, Linux build scripts and installation instructions, as well as comprehensive documentation of the compiler and simulator APIs and tool manual pages. Example SystemC models, such as an abstract DVD player and the Mandelbrot renderer, and a regression test bench are included as well.

Given a suitable Linux platform (see Note 2), the RISC source code package can be easily installed and then tested. After downloading and adjusting the installation Makefile, a simple make all command builds and installs the RISC framework and runs several demo examples. The user can then fully evaluate the software with other SystemC examples and even extend our proof-of-concept implementation with new features.

7.4.2 Binary Image for “Plug-and-Play” Evaluation

For a quick test run without compilation and installation, we also provide a Docker container [3] for using RISC in “plug-and-play” fashion. The Docker image contains RISC (and all needed libraries) in binary format and allows the user to test it with just a few Linux commands, as shown in Fig. 7.5.

7.5 Conclusion

The Recoding Infrastructure for SystemC (RISC) provides an automatic compiler-based framework to analyze and simulate IEEE SystemC models in parallel. In particular, we have introduced the RISC compiler and simulator. Using automatic conflict analysis based on the segment graph (SG) abstraction, OoO PDES can execute threads safely in parallel and out of order (ahead of time), and thus achieves high simulation speed while maintaining the classic SystemC modeling semantics. In order to foster collaboration in the EDA community, we provide the RISC framework as a free open source artifact for full evaluation and possible extension.

For the future, we intend to expand our open source efforts and hope to involve other members of the EDA community to use, evaluate, and extend the RISC framework.

Notes

1. Because of its importance, artifact evaluation has been adopted as an integral part of the review process in several computer science areas, such as Software Engineering and Programming Languages [10, 14].
2. Red Hat Enterprise and CentOS Linux versions 6 and 7 are verified to work.

References

1. Accellera Systems Initiative, Core SystemC Language and Examples. http://accellera.org/downloads/standards/systemc
2. Center for Embedded and Cyber-physical Systems, Recoding Infrastructure for SystemC (RISC). http://www.cecs.uci.edu/~doemer/risc.html
3. Center for Embedded and Cyber-physical Systems, RISC Docker Container. https://hub.docker.com/r/ucirvinelecs/risc050/
4. Center for Embedded and Cyber-physical Systems, RISC Release version 0.5.0. http://www.cecs.uci.edu/~doemer/risc.html#RISC050
5. W. Chen, X. Han, R. Dömer, Multi-core simulation of transaction level models using the system-on-chip environment. IEEE Des. Test Comput. 28(3), 20–31 (2011)
6. W. Chen, X. Han, C.W. Chang, G. Liu, R. Dömer, Out-of-order parallel discrete event simulation for transaction level models. IEEE Trans. Comput. Aided Des. Integr. Circuits Syst. 33(12), 1859–1872 (2014). https://doi.org/10.1109/TCAD.2014.2356469
7. R. Dömer, Seven obstacles in the way of standard-compliant parallel SystemC simulation. IEEE Embed. Syst. Lett. 8(4), 81–84 (2016). https://doi.org/10.1109/LES.2016.2617284
8. R. Dömer, G. Liu, T. Schmidt, Parallel simulation, in Handbook of Hardware/Software Codesign, ed. by S. Ha, J. Teich (Springer, Dordrecht, 2017), pp. 1–32
9. R. Dömer, Z. Cheng, D. Mendoza, A. Dingankar, RISC: recoding infrastructure for SystemC, open source framework for parallel simulation, in Workshop on Open-Source EDA Technology (WOSET) at ICCAD (2018)
10. Evaluate Collaboratory, Artifact Evaluation. http://evaluate.inf.usi.ch/artifacts
11. P. Ezudheen, P. Chandran, J. Chandra, B.P. Simon, D. Ravi, Parallelizing SystemC kernel for fast hardware simulation on SMP machines, in PADS ’09: Proceedings of the 2009 ACM/IEEE/SCS 23rd Workshop on Principles of Advanced and Distributed Simulation (2009), pp. 80–87
12. R. Fujimoto, Parallel discrete event simulation. Commun. ACM 33(10), 30–53 (1990)
13. IEEE Computer Society, IEEE Standard 1666-2011 for Standard SystemC Language Reference Manual (IEEE, New York, 2011)
14. S. Krishnamurthi, Artifact Evaluation Process. http://www.artifact-eval.org/
15. S. Krishnamurthi, J. Vitek, The real software crisis: repeatability as a core value. Commun. ACM 58(3), 34–36 (2015). https://doi.org/10.1145/2658987
16. G. Liu, T. Schmidt, Z. Cheng, D. Mendoza, R. Dömer, RISC compiler and simulator, release V0.5.0: out-of-order parallel simulatable SystemC subset. Technical Report, CECS-TR-18-03, Center for Embedded and Cyber-physical Systems, University of California, Irvine (2018)
17. D. Mendoza, R. Dömer, A tool for visualization of SystemC models. Technical Report, CECS-TR-17-06, Center for Embedded and Cyber-physical Systems, University of California, Irvine (2017)
18. D.J. Quinlan, ROSE: compiler support for object-oriented frameworks. Parallel Process. Lett. 10(2/3), 215–226 (2000)
19. C. Roth, S. Reder, H. Bucher, O. Sander, J. Becker, Adaptive algorithm and tool flow for accelerating SystemC on many-core architectures, in Digital System Design (DSD), 17th Euromicro Conference (2014)
20. T. Schmidt, G. Liu, R. Dömer, Exploiting thread and data level parallelism for ultimate parallel SystemC simulation, in Proceedings of the Design Automation Conference (DAC) (2017)
21. R. Sinha, A. Prakash, H.D. Patel, Parallel simulation of mixed-abstraction SystemC models on GPUs and multicore CPUs, in Proceedings of the Asia and South Pacific Design Automation Conference (ASPDAC) (2012)
22. N. Ventroux, T. Sassolas, A new parallel SystemC kernel leveraging manycore architectures, in Proceedings of the Design, Automation and Test in Europe (DATE) Conference (2016)
23. J.H. Weinstock, R. Leupers, G. Ascheid, D. Petras, A. Hoffmann, SystemC-link: parallel SystemC simulation using time-decoupled segments, in Proceedings of the Design, Automation and Test in Europe (DATE) Conference (2016)
24. D. Yun, J. Kim, S. Kim, S. Ha, Simulation environment configuration for parallel simulation of multicore embedded systems, in Proceedings of the Design Automation Conference (DAC) (2011), pp. 345–350

Acknowledgements

This work has been supported in part by substantial funding from Intel Corporation for two projects titled “Out-of-Order Parallel Simulation of SystemC Virtual Platforms on Many-Core Architectures” and “Scaling the Recoding Infrastructure for Parallel SystemC Simulation.” The authors thank Intel Corporation for the valuable support.

Author information

Authors and Affiliations

Center for Embedded and Cyber-Physical Systems, University of California, Irvine, CA, USA
Rainer Dömer, Zhongqi Cheng, Daniel Mendoza & Emad Arasteh

Corresponding author

Correspondence to Rainer Dömer.

Editor information

Editors and Affiliations

Dortmund, Germany
Jian-Jia Chen

Rights and permissions

Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

About this chapter

Cite this chapter

Dömer, R., Cheng, Z., Mendoza, D., Arasteh, E. (2021). Pushing the Limits of Parallel Discrete Event Simulation for SystemC. In: Chen, JJ. (eds) A Journey of Embedded and Cyber-Physical Systems. Springer, Cham. https://doi.org/10.1007/978-3-030-47487-4_7

DOI: https://doi.org/10.1007/978-3-030-47487-4_7
Published: 31 July 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-47486-7
Online ISBN: 978-3-030-47487-4
eBook Packages: Engineering, Engineering (R0)
