
High-performance application mapping in network-on-chip-based multicore systems

The Journal of Supercomputing

Abstract

The allocation of resources and scheduling of tasks, that is, mapping, in multicore systems-on-chip (MCSoC) poses significant challenges. Tasks have diverse resource requirements and interact with each other, while a network-on-chip (NoC)-based MCSoC consists of heterogeneous cores and communication networks. The heterogeneity of resources in MCSoC, along with the varying computational and communication demands of applications, makes mapping a complex optimization problem. We model the mapping problem in NoC-based MCSoC as a mixed-integer linear program (MILP) whose objective is to minimize the execution time of applications. The model incorporates the computation and communication capacities and the energy budget constraints of the MCSoC, as well as the execution time requirements of applications. We further propose heuristics based on simulated annealing (SA) and a genetic algorithm (GA) that respect the capacity and budget constraints of MCSoC systems, accelerate applications, and provide quick mapping solutions. Simulation results demonstrate that the performance of the SA and GA heuristics is within 10% of the optimal MILP solutions across various applications for 2D-mesh NoCs with 16–100 cores. The energy consumption of the SA and GA heuristics is also very close to that of the optimal MILP solutions, with a few exceptions on the small-scale 16-core NoC. Additionally, SA outperforms GA in most cases for all applications.
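
To make the SA heuristic mentioned in the abstract more concrete, the following is a minimal sketch of simulated annealing applied to task-to-core mapping on a 2D mesh. It is an illustration under stated assumptions, not the method of the article: the cost function (traffic volume times hop count), the geometric cooling schedule, the swap move, and all names (`hop_distance`, `comm_cost`, `anneal_mapping`, `traffic`) are hypothetical, and the capacity, energy budget, and execution time constraints of the MILP model are omitted.

```python
import math
import random


def hop_distance(a, b, width):
    """Manhattan hop count between cores a and b on a 2D mesh with `width` columns."""
    ax, ay = a % width, a // width
    bx, by = b % width, b // width
    return abs(ax - bx) + abs(ay - by)


def comm_cost(mapping, traffic, width):
    """Assumed proxy objective: sum of traffic volume times hop count.

    mapping[task] is the core hosting that task; traffic maps
    (source_task, destination_task) pairs to a communication volume.
    """
    return sum(vol * hop_distance(mapping[s], mapping[d], width)
               for (s, d), vol in traffic.items())


def anneal_mapping(num_tasks, traffic, width, height,
                   steps=20000, t_start=10.0, t_end=0.01, seed=0):
    """Search for a low-cost one-task-per-core mapping with simulated annealing."""
    rng = random.Random(seed)
    perm = list(range(width * height))  # perm[:num_tasks] holds the task cores
    if num_tasks > len(perm):
        raise ValueError("more tasks than cores")
    rng.shuffle(perm)  # random initial mapping

    cost = comm_cost(perm, traffic, width)
    best, best_cost = perm[:num_tasks], cost
    alpha = (t_end / t_start) ** (1.0 / steps)  # geometric cooling factor
    temp = t_start

    for _ in range(steps):
        # Move: swap a task's core with another task's core or with a free core.
        i = rng.randrange(num_tasks)
        j = rng.randrange(len(perm))
        perm[i], perm[j] = perm[j], perm[i]
        new_cost = comm_cost(perm, traffic, width)
        delta = new_cost - cost

        # Metropolis rule: always accept improvements, sometimes accept worse moves.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            cost = new_cost
            if cost < best_cost:
                best, best_cost = perm[:num_tasks], cost
        else:
            perm[i], perm[j] = perm[j], perm[i]  # reject: undo the swap
        temp *= alpha

    return best, best_cost
```

For example, `anneal_mapping(4, {(0, 1): 1, (1, 2): 1, (2, 3): 1}, 2, 2)` maps a four-task pipeline onto a 2x2 mesh and returns the best mapping found together with its cost. Recomputing the full cost on every move keeps the sketch short; an incremental cost update would be the natural refinement for larger meshes.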



Author information

Corresponding author

Correspondence to Md Farhadur Reza.


About this article


Cite this article

Reza, M.F. High-performance application mapping in network-on-chip-based multicore systems. J Supercomput (2024). https://doi.org/10.1007/s11227-024-06184-9


  • DOI: https://doi.org/10.1007/s11227-024-06184-9
