Profiling and annotation combined method for multimedia application specific MPSoC performance estimation

  • Kai Huang
  • Xiao-xu Zhang
  • Si-wen Xiu (email author)
  • Dan-dan Zheng
  • Min Yu
  • De Ma
  • Kai Huang
  • Gang Chen
  • Xiao-lang Yan


Accurate and fast performance estimation is necessary to drive design space exploration and thus support important design decisions. Current techniques are either time-consuming or not accurate enough. In this paper, we address these problems by presenting a hybrid method for multimedia multiprocessor system-on-chip (MPSoC) performance estimation. GNU gcov, a general-purpose coverage analysis tool, is employed to collect execution statistics during native simulation. To tackle complexity and keep analysis and simulation manageable, the communication and computation parts are orthogonalized, that is, estimated separately. The estimate for the computation part is annotated onto a transaction-accurate model for further analysis, which supports gradual refinement of the MPSoC performance estimate. The implementation and its experimental results demonstrate the feasibility and efficiency of the proposed method.
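
The profiling step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it only assumes the standard textual layout of a .gcov file (count:line number:source) and a hypothetical per-line cycle cost table, `cycles_per_line`, standing in for whatever target-specific cost model the paper uses. The function names `parse_gcov` and `estimate_cycles` and the sample file contents are likewise invented for illustration.

```python
# Sketch: turn gcov line counts from a native run into a computation-time estimate.
# Assumption: per-line cycle costs for the target processor are known (hypothetical here).

SAMPLE = """\
        -:    0:Source:filter.c
        -:    1:#include <stdint.h>
        1:    2:int sum(const uint8_t *p, int n) {
       10:    3:    for (int i = 0; i < n; i++) s += p[i];
    #####:    4:    if (n < 0) return -1;
        1:    5:    return s;
"""


def parse_gcov(text):
    """Return {line_number: execution_count} from the text of a .gcov file."""
    counts = {}
    for raw in text.splitlines():
        parts = raw.split(":", 2)
        if len(parts) < 3:
            continue  # not a "count:line:source" record
        count_field = parts[0].strip()
        line_field = parts[1].strip()
        if not line_field.isdigit() or int(line_field) == 0:
            continue  # header records use line number 0
        if count_field == "-":
            continue  # line contains no executable code
        if count_field in ("#####", "====="):
            counts[int(line_field)] = 0  # executable but never run
            continue
        digits = count_field.rstrip("*")  # tolerate a trailing '*' marker
        if digits.isdigit():
            counts[int(line_field)] = int(digits)
    return counts


def estimate_cycles(counts, cycles_per_line, default_cost=1):
    """Sum execution count times assumed cycle cost over all profiled lines."""
    return sum(n * cycles_per_line.get(line, default_cost)
               for line, n in counts.items())


if __name__ == "__main__":
    counts = parse_gcov(SAMPLE)
    # Hypothetical costs in target cycles per execution of each source line.
    print(estimate_cycles(counts, {2: 4, 3: 6, 5: 2}))
```

In a flow like the one the abstract outlines, a total of this kind would be the value annotated onto the computation part of the transaction-accurate model, while communication delays are left to that model's own simulation.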

Key words

MPSoC; Gradual refinement; Native simulation; Performance estimation; Profiling; Annotation; Gcov

CLC number

TP36; TN47




Copyright information

© Journal of Zhejiang University Science Editorial Office and Springer-Verlag Berlin Heidelberg 2015

Authors and Affiliations

  • Kai Huang (1)
  • Xiao-xu Zhang (1)
  • Si-wen Xiu (2, email author)
  • Dan-dan Zheng (1)
  • Min Yu (1)
  • De Ma (3)
  • Kai Huang (4)
  • Gang Chen (4)
  • Xiao-lang Yan (1)

  1. Institute of VLSI Design, Zhejiang University, Hangzhou, China
  2. College of Optical and Electronic Technology, China Jiliang University, Hangzhou, China
  3. Microelectronics CAD Center, MOE Key Lab of RF Circuits and Systems, Hangzhou Dianzi University, Hangzhou, China
  4. Department of Informatics VI, Technical University Munich, Garching, Germany
