We present an estimation methodology that accurately predicts the execution time of a given neural network (NN) on a given embedded Artificial Intelligence (AI) accelerator. The timing prediction is implemented as a Python library called MONNET and produces its predictions within milliseconds by analyzing the Keras description of the NN under test. This enables several techniques for designing NNs for embedded hardware. Designers can avoid training networks that might be functionally sufficient but are likely to miss the timing requirements. The technique can also be integrated into automated network architecture search algorithms, so that hardware-specific execution times become one contributor to the search's target function.
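The two uses described above (discarding candidates before training, and adding timing to a search objective) can be sketched as follows. This is a minimal illustration and not the MONNET API: `filter_by_latency`, `search_score`, the `predict_ms` callable, and the penalty form are all hypothetical names and choices introduced here; any real predictor would be passed in as `predict_ms`.

```python
from typing import Callable, List, TypeVar

Candidate = TypeVar("Candidate")


def filter_by_latency(
    candidates: List[Candidate],
    predict_ms: Callable[[Candidate], float],
    budget_ms: float,
) -> List[Candidate]:
    """Keep only candidates whose predicted latency meets the budget.

    predict_ms is a hypothetical stand-in for any execution time
    predictor; it is not a function of a real library.
    """
    return [c for c in candidates if predict_ms(c) <= budget_ms]


def search_score(
    accuracy: float,
    latency_ms: float,
    budget_ms: float,
    penalty: float = 1.0,
) -> float:
    """Combine accuracy and predicted latency into one search target.

    Assumed form: accuracy minus a penalty proportional to the
    relative overshoot of the latency budget. Networks within the
    budget are not penalised.
    """
    overshoot = max(0.0, latency_ms / budget_ms - 1.0)
    return accuracy - penalty * overshoot


if __name__ == "__main__":
    # Toy candidates with made-up predicted latencies (illustrative only).
    toy_latency = {"small": 4.0, "medium": 9.0, "large": 25.0}
    kept = filter_by_latency(list(toy_latency), toy_latency.get, budget_ms=10.0)
    print(kept)
    print(search_score(accuracy=0.9, latency_ms=15.0, budget_ms=10.0))
```

Because a prediction takes only milliseconds, a filter of this kind can run on every candidate before any training cost is incurred; the soft penalty is one possible design choice among several for a search objective.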
To produce precise estimates, each new target hardware must first undergo an automatic characterization process that uses tens of thousands of different small NNs. This process may take several days, depending on the hardware.
We tested our methodology on the Intel Neural Compute Stick 2, where we achieved a root mean squared percentage error (RMSPE) below 21% for a wide range of industry-relevant NNs from vision processing.
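For readers unfamiliar with the metric, the sketch below computes RMSPE under its common definition: the square root of the mean squared relative error, expressed in percent. The abstract does not state the exact formula used in the paper, so this definition and the function name `rmspe` are assumptions.

```python
import math
from typing import Sequence


def rmspe(measured: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean squared percentage error, in percent.

    Assumes the common definition
        100 * sqrt(mean(((predicted - measured) / measured) ** 2)),
    with the measured execution times as the reference values.
    """
    if len(measured) != len(predicted) or len(measured) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_rel_errors = [
        ((p - m) / m) ** 2 for m, p in zip(measured, predicted)
    ]
    return 100.0 * math.sqrt(sum(squared_rel_errors) / len(squared_rel_errors))


if __name__ == "__main__":
    # Made-up execution times in milliseconds, for illustration only.
    print(rmspe([10.0, 20.0], [11.0, 18.0]))
```

Since each error is taken relative to the measured time, short and long running networks contribute on the same scale, which suits an evaluation spanning a wide range of network sizes.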
- Execution time
- Neural networks
- Analytical model
This publication was created as part of the research project KI Delta Learning (project number: 19A19013K) funded by the Federal Ministry for Economic Affairs and Energy (BMWi) on the basis of a decision by the German Bundestag.
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Osterwind, A., Droste-Rehling, J., Vemparala, MR., Helms, D. (2023). Hardware Execution Time Prediction for Neural Network Layers. In: Koprinska, I., et al. Machine Learning and Principles and Practice of Knowledge Discovery in Databases. ECML PKDD 2022. Communications in Computer and Information Science, vol 1752. Springer, Cham. https://doi.org/10.1007/978-3-031-23618-1_39
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-23617-4
Online ISBN: 978-3-031-23618-1