Skip to main content
Log in

Deep Learning Inference with Dynamic Graphs on Heterogeneous Platforms

  • Published:
International Journal of Parallel Programming Aims and scope Submit manuscript


One major drawback of deep-learning algorithms is the elevated cost of computing complexity and memory bandwidth required for inference. In order to ameliorate these costs in applications that utilize Convolutional Neural Networks (CNNs), a new, radical, approach is the dynamic pruning of kernels which aims to the parsimonious inference by learning to exploit and dynamically remove the redundant capacity of a CNN architecture. This conditional execution approach formulates a systematic and data-driven method for developing CNNs that are trained to eventually change size and form in real-time during inference, targeting to the smaller possible computational footprint. The conditional execution however, induces a number of challenges when it comes to the implementation of these algorithms to embedded systems. In this paper we present a systematic way of deploying this new dynamic pruning methodology, in heterogeneous platforms that facilitate both CPU and GPU subsystems. Realtime measurements of embedded implementations in modern SoCs verify the efficacy of the proposed methodology and demonstrate the ability of the dynamic networks to both adapt their size to the complexity of the task and deliver significant computational gains during inference.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Subscribe and save

Springer+ Basic
EUR 32.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or Ebook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7
Fig. 8
Fig. 9

Similar content being viewed by others


  1. Russakovsky, O., Deng, J., Su, H., et al.: ImageNet large scale visual recognition challenge. Int. J. Comput. Vis. 115, 211–252 (2015).

    Article  MathSciNet  Google Scholar 

  2. Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: Proceedings of Advances in Neural Information Processing Systems (NIPS), pp. 1097–1105 (2012)

  3. Ignatov, A., Timofte, R., Chou, W., et al.: AI Benchmark: running deep neural networks on android smartphones. (2018). Last Revised 15 Oct 2018

  4. Knoblauch, A., Körner, E., Körner, U., Sommer, F.T.: Structural synaptic plasticity has high memory capacity and can explain graded amnesia, catastrophic forgetting, and the spacing effect. PLoS ONE 9(5), e96485 (2014).

    Article  Google Scholar 

  5. Bengio, E., Bacon, P.L., Pineau, J., Precup, D.: Conditional computation in neural networks for faster models. (2015). Last Revised 7 Jan 2016

  6. Theodorakopoulos, I., Pothos, V., Kastaniotis, D., Fragoulis, N.: Parsimonious inference on convolutional neural networks: learning and applying on-line kernel activation rules. (2017). Last Revised 31 Jan 2017

  7. Hu, H., Peng, R., Tai, Y.-W., Tang, C.-K.: Network trimming: a data-driven neuron pruning approach towards efficient deep architectures. (2016). Submitted on 12 Jul 2016

  8. Feng, J., Darrell, T.: Learning the structure of deep convolutional networks. IEEE Int. Conf. Comput. Vis. (ICCV) 2749–2757, 1135–1143 (2015)

    Google Scholar 

  9. Wen, W., Wu, C., Wang, Y., Chen, Y., Li, H.: Learning structured sparsity in deep neural networks. In: Proceedings of Advances in Neural Information Processing Systems (NIPS), pp. 2074–2082 (2016)

  10. Yang, T.J., Yu-Hsin, C., Vivienne, S.: Designing energy-efficient convolutional neural networks using energy-aware pruning. In: Proceedings IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 6071–6079 (2016)

  11. Han, S., Pool, J., Tran, J., Dally, W.J.: Learning both weights and connections for efficient neural networks. In: Proceedings of Advances in Neural Information Processing Systems (NIPS), pp. 1135–1143 (2015)

  12. Figurnov, M., Vetrov, D., Kohl, P.: PerforatedCNNs: acceleration through elimination of redundant convolutions, (2015). Last Revised 16 Oct 2016

  13. Bossard, L., Guillaumin, M., Van Gool, L.: Food-101—mining discriminative components with random forests. In: Fleet D., Pajdla T., Schiele B., Tuytelaars T. (eds.) Computer Vision—ECCV 2014. ECCV 2014. Lecture Notes in Computer Science, vol. 8694. Springer, Berlin (2014)

  14. Forrest, N.I., Song, H., et al.: SqueezeNet: AlexNet-level accuracy with 50 × fewer parameters and < 1 MB model size. (2016). Last Revised 4 Nov 2016

  15. Qualcomm Neural Processing SDK for AI ( Accessed 2019

  16. Irida Labs S.A. ( Accessed 2020

Download references


This work has received funding from the European Union’s Horizon 2020 Research and Innovation Programme under grant agreement No. 780788.

Author information

Authors and Affiliations


Corresponding author

Correspondence to N. Fragoulis.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Pothos, V., Vassalos, E., Theodorakopoulos, I. et al. Deep Learning Inference with Dynamic Graphs on Heterogeneous Platforms. Int J Parallel Prog 49, 158–176 (2021).

Download citation

  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: