Application Runtime Estimation for AURIX Embedded MCU Using Deep Learning

Fricke, Florian; Scharoba, Stefan; Rachuj, Sebastian; Konopik, Andreas; Kluge, Florian; Hofstetter, Georg; Reichenbach, Marc

doi:10.1007/978-3-031-15074-6_15

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13511)

Included in the following conference series:

International Conference on Embedded Computer Systems

Abstract

Estimating execution time is a crucial task during the development of safety-critical embedded systems. Processor simulation or emulation tools on various abstraction levels offer a trade-off between accuracy and runtime. Building such tools typically requires detailed knowledge of the processor architecture and high manual effort to construct adequate models. In this paper, we explore how deep learning may be used as an alternative approach for building processor performance models. First, we describe how to obtain training data from recorded execution traces. Next, we evaluate various neural network architectures and hyperparameter values. Finally, the accuracy of the best network variants is compared to two simple baseline models and a mechanistic model based on the QEMU emulator. This evaluation identifies a model based on the Wavenet architecture that outperforms all other approaches, achieving a mean absolute percentage error of only 1.63%.
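
The abstract reports accuracy as a mean absolute percentage error (MAPE) but does not spell out the formula. The sketch below assumes the standard definition: the mean of |measured - predicted| / |measured|, expressed in percent. The function name and the cycle counts are hypothetical and chosen only for illustration; they are not data from the paper.

```python
def mean_absolute_percentage_error(measured, predicted):
    """Return the MAPE in percent (standard definition, assumed here)."""
    if len(measured) != len(predicted):
        raise ValueError("measured and predicted must have the same length")
    if not measured:
        raise ValueError("at least one sample is required")
    total = 0.0
    for m, p in zip(measured, predicted):
        if m == 0:
            # The relative error is undefined for a zero reference value.
            raise ValueError("measured values must be non-zero")
        total += abs((m - p) / m)
    return 100.0 * total / len(measured)


# Hypothetical cycle counts for three code sections (illustration only).
measured_cycles = [100, 200, 400]
predicted_cycles = [110, 190, 400]

# Relative errors of 10%, 5% and 0% average to 5%.
mape = mean_absolute_percentage_error(measured_cycles, predicted_cycles)
print(f"MAPE: {mape:.2f}%")
```

Under this definition, a MAPE of 1.63% would mean that predicted runtimes deviate from the measured ones by 1.63% on average, relative to the measured value.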

Acknowledgement

We would like to express our gratitude to Elektronische Fahrwerksysteme GmbH for supporting this work. Furthermore, we are grateful to Lauterbach GmbH for their loan of an Off-chip Serial Trace device. This project would not have been feasible without such a device.

Author information

Authors and Affiliations

BTU Cottbus-Senftenberg, Cottbus, Germany
Florian Fricke, Stefan Scharoba, Andreas Konopik & Marc Reichenbach
Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
Sebastian Rachuj & Andreas Konopik
Elektronische Fahrwerksysteme GmbH, Gaimersheim, Germany
Andreas Konopik, Florian Kluge & Georg Hofstetter

Corresponding author

Correspondence to Florian Fricke.

Editor information

Editors and Affiliations

University of California, La Jolla, CA, USA
Alex Orailoglu
BTU-Cottbus Senftenberg, Cottbus, Germany
Marc Reichenbach
Fraunhofer IESE, Kaiserslautern, Germany
Matthias Jung

About this paper

Cite this paper

Fricke, F. et al. (2022). Application Runtime Estimation for AURIX Embedded MCU Using Deep Learning. In: Orailoglu, A., Reichenbach, M., Jung, M. (eds) Embedded Computer Systems: Architectures, Modeling, and Simulation. SAMOS 2022. Lecture Notes in Computer Science, vol 13511. Springer, Cham. https://doi.org/10.1007/978-3-031-15074-6_15

DOI: https://doi.org/10.1007/978-3-031-15074-6_15
Published: 14 August 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-15073-9
Online ISBN: 978-3-031-15074-6
eBook Packages: Computer Science, Computer Science (R0)
