Journal of Signal Processing Systems, Volume 75, Issue 2, pp 109–122

C++ Support and Applications for Embedded Multicore DSP Systems

  • Chi-Bang Kuan
  • Jia-Jhe Li
  • Chung-Kai Chen
  • Jenq-Kuen Lee


In recent years, embedded systems have entered the multicore era. As the number of cores in embedded systems keeps growing, it becomes more important to provide programming support that respects embedded system constraints while helping programmers utilize multicore hardware. Although C still dominates embedded programming, C++ is gaining importance in parallel programming, which makes C++ support for embedded multicore systems promising. However, embedded systems usually have tight resource budgets, and C++ is commonly considered to incur a code size that embedded systems cannot afford. Therefore, in this paper we investigate the code size requirements of a C++ library and propose a layered design that provides code-size-aware library support. In addition, to utilize embedded multicore systems, we employ C++ language features to facilitate embedded multicore programming. With C++, we incorporate high-level abstractions and design patterns into the programming support to enhance the low-level programming APIs used to exploit DSPs, SIMD instructions, and DMAs on embedded multicore systems. Finally, we evaluate our C++ support with a Blur program and a JPEG program. Our results on a dual-DSP platform show speedups of 3.32 and 3.09 for the Blur and JPEG programs, respectively.


Keywords: Embedded multicore systems; C++ library; Programming models; Parallel design patterns



This work is supported in part by the National Science Council (NSC) under grant nos. 101-2220-E-007-001 and 101-2219-E-007-004 and by the Ministry of Economic Affairs (MOEA) under grant no. 101-EC-17-A-02-S1-202 in Taiwan.



Copyright information

© Springer Science+Business Media New York 2013

Authors and Affiliations

  • Chi-Bang Kuan (1)
  • Jia-Jhe Li (1)
  • Chung-Kai Chen (2)
  • Jenq-Kuen Lee (1, corresponding author)

  1. Department of Computer Science, National Tsing Hua University, Hsinchu City, Taiwan
  2. Realtek Semiconductor Corp., Hsinchu Science Park, Hsinchu City, Taiwan
