Software Compilation Techniques for MPSoCs

  • Rainer Leupers
  • Weihua Sheng
  • Jeronimo Castrillon


Growing demands on future embedded systems, such as high performance and energy efficiency, have led to the emergence of heterogeneous Multiprocessor System-on-Chip (MPSoC) architectures. To exploit the full power of these architectures, new tools are needed that manage the increasing complexity of the software while keeping productivity high. An MPSoC compiler is the tool-chain that, for a given (pre-)verified MPSoC platform, addresses the problems of expressing parallelism in application modeling and programming, mapping and scheduling the application, and generating the software to be distributed across the platform for efficient use. This chapter discusses the various aspects of MPSoC compilers for heterogeneous MPSoC architectures by comparing them with well-established uni-processor C compiler technology. After a brief introduction to MPSoCs and MPSoC compilers, the important ingredients of the compilation process, such as programming models, granularity and partitioning, platform description, mapping/scheduling, and code generation, are explained in detail. Because the topic is relatively young, the chapter closes with a number of case studies from academia and industry that illustrate the concepts.


Keywords: Programming Model, Basic Block, Code Block, Virtual Platform, Instruction Level Parallelism





Copyright information

© Springer Science+Business Media, LLC 2010

Authors and Affiliations

  • Rainer Leupers (1)
  • Weihua Sheng (1)
  • Jeronimo Castrillon (1)

  1. Institute for Software for Systems on Silicon, RWTH Aachen University, Aachen, Germany
