Embedded Multi-Core Systems Dedicated to Dynamic Dataflow Programs


Multimedia applications and embedded platforms are both becoming increasingly complex in order to improve the user experience. Multimedia developers therefore need high-level methods to automate time-consuming and error-prone tasks. Dynamic dataflow modeling is an attractive way to describe complex applications, such as video codecs, at a high level of abstraction. This paper presents a dataflow-based design approach for implementing video codecs on embedded multi-core platforms. First, we introduce a custom architecture model for designing low-power multi-core chips based on distributed memory and Transport-Triggered Architecture processor cores. Then, we describe software synthesis techniques that improve dynamic dataflow implementations. This methodology has been implemented in open-source tools and demonstrated on video decoders based on the MPEG-4 Visual standard and the new High Efficiency Video Coding standard. In simulation, the approach achieves real-time decoding (40 FPS) of high-definition (720p) MPEG-4 Visual video sequences on a custom multi-core platform clocked at 1 GHz, an improvement of more than 100% over previously proposed implementations.


Author information



Corresponding author

Correspondence to Hervé Yviquel.

Additional information

We would like to thank the organizations that have partially funded this work, such as the Center for International Mobility (CIMO) and the Academy of Finland (funding decision 253087). We would also like to give special thanks to the Orcc and TCE communities as a whole for actively participating in the development of the tools, which offer a solid foundation for this work.

About this article

Cite this article

Yviquel, H., Sanchez, A., Jääskeläinen, P. et al. Embedded Multi-Core Systems Dedicated to Dynamic Dataflow Programs. J Sign Process Syst 80, 121–136 (2015). https://doi.org/10.1007/s11265-014-0953-5


Keywords

  • Dataflow programming
  • Video coding
  • HEVC
  • Embedded system