A JIT Compiler for Neural Network Inference

  • Felix Thielke
  • Arne Hasselbring
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11531)


This paper describes a C++ library that compiles neural network models at runtime into machine code that performs inference. In general, this approach promises the best possible performance because statically known properties of the network can be integrated directly into the generated code. In our experiments on the NAO V6 platform, it significantly outperforms existing implementations on small networks but is inferior to them on large networks. The library was already part of the B-Human code release 2018 [12]; it has since been extended and is now available as a standalone version that can be integrated into any C++14 code base [18].
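To illustrate what "integrating statically known properties into the code" can mean, the following sketch contrasts a generic dense layer with a hand-specialized one. It is not the library's API and does not generate machine code: the function names, the 2x3 layer shape, and all weight values are made up for illustration, and writing the specialization by hand at compile time merely stands in for what a JIT compiler would emit at runtime once the model is loaded.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Generic, interpreter-style dense layer with ReLU. Shapes and weights are
// only known at runtime, so every call loops over dynamically sized buffers.
// weights is row-major: outputs x inputs.
std::vector<float> denseGeneric(const std::vector<float>& weights,
                                const std::vector<float>& bias,
                                const std::vector<float>& input) {
  const std::size_t outputs = bias.size();
  const std::size_t inputs = input.size();
  std::vector<float> result(outputs);
  for (std::size_t o = 0; o < outputs; ++o) {
    float sum = bias[o];
    for (std::size_t i = 0; i < inputs; ++i)
      sum += weights[o * inputs + i] * input[i];
    result[o] = sum > 0.f ? sum : 0.f; // ReLU
  }
  return result;
}

// Specialized version of one hypothetical 2x3 layer (assumed values).
// Shape and weights appear as constants in the code: the loops are fully
// unrolled, nothing is read from a weight buffer, and the term with a zero
// weight (output 1, input 2) is dropped entirely.
std::array<float, 2> denseSpecialized(const std::array<float, 3>& x) {
  const float y0 = 0.5f + 1.0f * x[0] - 2.0f * x[1] + 0.25f * x[2];
  const float y1 = -1.0f + 0.5f * x[0] + 0.5f * x[1];
  return {{y0 > 0.f ? y0 : 0.f, y1 > 0.f ? y1 : 0.f}};
}

// Checks that both versions agree for the same (assumed) parameters.
void checkEquivalence() {
  const std::vector<float> weights{1.0f, -2.0f, 0.25f, 0.5f, 0.5f, 0.0f};
  const std::vector<float> bias{0.5f, -1.0f};
  const std::array<std::array<float, 3>, 2> samples{{{{4.0f, 1.0f, 2.0f}},
                                                     {{1.0f, 2.0f, 4.0f}}}};
  for (const auto& x : samples) {
    const std::vector<float> generic = denseGeneric(weights, bias, {x[0], x[1], x[2]});
    const std::array<float, 2> specialized = denseSpecialized(x);
    for (std::size_t o = 0; o < specialized.size(); ++o)
      assert(std::fabs(generic[o] - specialized[o]) < 1e-6f);
  }
}
```

Calling `checkEquivalence()` confirms that the two functions compute the same outputs for the sample inputs. The specialized version only works for this one layer, which is the trade-off a runtime compiler accepts: it pays a one-time compilation cost per model in exchange for code tailored to that model.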


  1. Abadi, M., et al.: TensorFlow: Large-scale machine learning on heterogeneous systems (2015)
  2. Bradski, G.: The OpenCV library. Dr. Dobb’s Journal of Software Tools (2000)
  3. Chollet, F., et al.: Keras (2015)
  4. Fog, A.: Instruction tables: Lists of instruction latencies, throughputs and micro-operation breakdowns for Intel, AMD and VIA CPUs (2018)
  5. Hermann, T., et al.: frugally-deep: Header-only library for using Keras models in C++ (2019)
  6. Hofmann, M., Schwarz, I., Urbann, O., Larisch, A.: Nao Devils team report 2018 (2019)
  7. Jia, Y., et al.: Caffe: convolutional architecture for fast feature embedding. arXiv preprint arXiv:1408.5093 (2014)
  8. Kobalicek, P., et al.: AsmJit – Complete x86/x64 JIT and remote assembler for C++ (2019)
  9. Nao-Team HTWK: Team research report 2018 (2019)
  10. nyanp, et al.: tiny-dnn: header only, dependency-free deep learning framework in C++14 (2018)
  11. Poppinga, B., Laue, T.: JET-Net: real-time object detection for mobile robots. In: Chalup, S., Niemueller, T., Suthakorn, J., Williams, M.-A. (eds.) RoboCup 2019: Robot World Cup XXIII. LNCS (LNAI), vol. 11531, pp. 227–240. Springer, Cham (2019)
  12. Röfer, T., et al.: B-Human team report and code release 2018 (2018)
  13. Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.C.: MobileNetV2: inverted residuals and linear bottlenecks. arXiv preprint arXiv:1801.04381 (2018)
  14. Schraudolph, N.N.: A fast, compact approximation of the exponential function. Neural Comput. 11(4), 853–862 (1999)
  15. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. In: International Conference on Learning Representations (2015)
  16. Szemenyei, M., Estivill-Castro, V.: Real-time scene understanding using deep neural networks for RoboCup SPL. In: Holz, D., Genter, K., Saad, M., von Stryk, O. (eds.) RoboCup 2018. LNCS (LNAI), vol. 11374, pp. 96–108. Springer, Cham (2019)
  17. The HDF Group: Hierarchical data format, version 5 (1997–2019)
  18. Thielke, F., Hasselbring, A.: CompiledNN: A JIT compiler for neural network inference (2019)

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Fachbereich 3 – Mathematik und Informatik, Universität Bremen, Bremen, Germany
