International Journal of Parallel Programming, Volume 44, Issue 6, pp 1296–1336

A Parallelization Approach for Hard Real-Time Systems and Its Application on Two Industrial Programs

Strategy and Two Case Studies for the Parallelization of Hard Real-Time Systems
  • Martin Frieb (Email author)
  • Ralf Jahr
  • Haluk Ozaktas
  • Andreas Hugl
  • Hans Regler
  • Theo Ungerer


Industrial applications have often grown and been improved over many years. As their performance demands increase, they need to benefit from the availability of multi-core processors. However, reimplementing these applications from scratch, or even restructuring them, is very expensive, often because of high certification effort. A strategy for the systematic parallelization of legacy code is therefore needed. We present a parallelization approach for hard real-time systems that ensures a high degree of legacy code reuse and preserves timing analysability. To show its applicability, we apply it to the core algorithm of an avionics application and to the control program of a large construction machine. We create models of the legacy programs that expose their potential parallelism, optimize these models, and change the source code accordingly. The parallelized applications are placed on a predictable multi-core processor with up to 18 cores. For evaluation, we compare the worst-case execution times and the resulting speedups. Furthermore, we analyse limitations that arise during the parallelization process.


Parallelization · Parallelization approach · Model-based · Parallel design patterns · Algorithmic skeletons · Real-time · Embedded · Control code · Case study



The research leading to these results has received funding from the European Union Seventh Framework Programme under the name parMERASA and Grant Agreement No. 287519.



Copyright information

© Springer Science+Business Media New York 2016

Authors and Affiliations

  • Martin Frieb (1), Email author
  • Ralf Jahr (1)
  • Haluk Ozaktas (2)
  • Andreas Hugl (3)
  • Hans Regler (3)
  • Theo Ungerer (1)

  1. Department of Computer Science, University of Augsburg, Augsburg, Germany
  2. Université Toulouse III - Paul Sabatier, Toulouse, France
  3. BAUER Maschinen GmbH, Schrobenhausen, Germany
