
Hard real-time application mapping reconfiguration for NoC-based many-core systems

  • Behnaz Pourmohseni
  • Stefan Wildermann
  • Michael Glaß
  • Jürgen Teich

Abstract

Real-time applications are increasingly targeting many-core platforms, demanding predictability in a highly dynamic environment. To enable this shift, a set of mapping candidates with diverse resource requirements and performance qualities (latency, energy, etc.) may be computed for each application at design time and then exploited at run time to launch the application on a mapping that adheres to the on-line quality and resource constraints. These constraints, however, may change during execution such that the mapping in use no longer satisfies them, necessitating a switch to another mapping. This process, called mapping reconfiguration, involves the migration of several tasks and may harm timing predictability if the reconfiguration overhead is not accounted for. This paper presents a deterministic mapping reconfiguration methodology that enables predictable reconfigurations among a given set of mappings. To this end, in an off-line analysis, we first (a) identify low-latency migration routes with minimal allocation overhead for each pair of source/target mappings and (b) bound the worst-case reconfiguration latency using an off-line timing analysis. This information is then used at run time to perform timely reconfigurations. We further investigate (c) a hybrid timing analysis that takes into account the actual availability of communication resources at run time to derive tighter latency bounds. Experimental results for a variety of applications show that the proposed methodology enables reconfigurations with low allocation overhead and affordable latency. To demonstrate the practicality of the proposed methodology and the advantages of the hybrid latency analysis over its off-line counterpart, we present a case study on thermal management of many-core systems using mapping reconfiguration.
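The run-time step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that worst-case reconfiguration latency bounds for each source/target pair have already been computed off-line and stored in a lookup table. The names `Mapping`, `select_target`, and `BOUNDS_MS`, the units, and the choice of energy as the tie-breaking criterion are all hypothetical and serve only to show how such a table could be consulted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mapping:
    """A design-time mapping candidate (fields and units are illustrative)."""
    name: str
    cores: int          # resource requirement, here simply a core count
    latency_ms: float   # application latency achieved under this mapping
    energy_mj: float    # energy per execution under this mapping


def select_target(current, candidates, reconf_bound_ms,
                  max_cores, max_latency_ms, reconf_budget_ms):
    """Pick a target mapping that meets the current constraints.

    A candidate is feasible if it fits the resource and latency constraints
    and if the precomputed worst-case reconfiguration latency from the
    current mapping stays within the given budget. Among feasible
    candidates, the one with the lowest energy is returned (an assumed
    policy). Returns None if no candidate is feasible.
    """
    feasible = [
        m for m in candidates
        if m.name != current.name
        and m.cores <= max_cores
        and m.latency_ms <= max_latency_ms
        # Pairs without a precomputed bound are treated as infeasible.
        and reconf_bound_ms.get((current.name, m.name), float("inf"))
        <= reconf_budget_ms
    ]
    return min(feasible, key=lambda m: m.energy_mj, default=None)


# Made-up example data: three candidates and off-line bounds from mapping A.
A = Mapping("A", cores=4, latency_ms=10.0, energy_mj=50.0)
B = Mapping("B", cores=2, latency_ms=18.0, energy_mj=30.0)
C = Mapping("C", cores=3, latency_ms=12.0, energy_mj=40.0)
CANDIDATES = [A, B, C]
BOUNDS_MS = {("A", "B"): 5.0, ("A", "C"): 9.0}

# Only three cores remain available; reconfiguration must finish in 6 ms.
target = select_target(A, CANDIDATES, BOUNDS_MS,
                       max_cores=3, max_latency_ms=20.0, reconf_budget_ms=6.0)
```

In this sketch a tighter bound, such as one obtained from a hybrid analysis at run time, would simply replace the table entry and could make additional candidates feasible under the same budget.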

Keywords

Mapping reconfiguration; Real-time task migration; Run-time predictability; Worst-case timing analysis; Hybrid timing analysis

Notes

Acknowledgements

We would like to thank the anonymous reviewers for their valuable feedback. This work was supported by the German Research Foundation (DFG) as part of the Transregional Collaborative Research Center “Invasive Computing” (SFB/TR 89).

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Erlangen, Germany
  2. Ulm University, Ulm, Germany
