A Semi-automatic Scratchpad Memory Management Framework for CMP

  • Ning Deng
  • Weixing Ji
  • Jaxin Li
  • Qi Zuo
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6965)


Previous research has demonstrated that scratchpad memory (SPM) consumes far less power and on-chip area than a traditional cache. As a software-managed memory, SPM has been widely adopted in today’s mainstream embedded processors. Traditional SPM allocation strategies depend on either the compiler or the programmer to manage this small memory. Compiler-based methods predict the frequently referenced data items before execution through static analysis or profiling, whereas programmer-based methods require the programmer to allocate the SPM space manually. For dynamically allocated heap data, there is no mature allocation scheme for multicore processors with a shared software-managed on-chip memory. This paper presents a novel SPM management framework for chip multiprocessors (CMP) featuring a partitioned global address space (PGAS) SPM architecture. The most frequently referenced heap data are maintained in the SPM. The framework mitigates the SPM allocation problem by leveraging the programmer’s hints to determine which data items are allocated to the SPM. The complex and error-prone allocation procedure is handled entirely by an SPM management library (SPMMLIB), without requiring the programmer’s attention. Performance is evaluated on a homogeneous UltraSPARC multiprocessor using the PARSEC and SPLASH-2 benchmarks. Experimental results indicate that, on average, energy consumption is reduced by 22.4% compared with a cache-based memory architecture.
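The hint-driven idea in the abstract can be illustrated with a toy model. This is only a sketch under stated assumptions, not SPMMLIB's actual interface: the class `ToySpmAllocator`, its `alloc`/`free` methods, the `hot_hint` flag, and the capacity value are all hypothetical names and numbers chosen for illustration. It shows a library placing a heap block in the SPM when the programmer marks it as frequently referenced and it fits, and falling back to off-chip memory otherwise.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    name: str
    size: int
    region: str  # "spm" or "dram"


@dataclass
class ToySpmAllocator:
    """Hypothetical stand-in for an SPM management library (not SPMMLIB's API)."""
    spm_capacity: int
    spm_used: int = 0
    blocks: dict = field(default_factory=dict)

    def alloc(self, name: str, size: int, hot_hint: bool = False) -> Block:
        # Honor the programmer's hint only if the block fits in the remaining SPM.
        if hot_hint and self.spm_used + size <= self.spm_capacity:
            self.spm_used += size
            region = "spm"
        else:
            region = "dram"  # assumed fallback: ordinary off-chip memory
        block = Block(name, size, region)
        self.blocks[name] = block
        return block

    def free(self, name: str) -> None:
        # Release the block and return its SPM space, if it occupied any.
        block = self.blocks.pop(name)
        if block.region == "spm":
            self.spm_used -= block.size


allocator = ToySpmAllocator(spm_capacity=1024)  # capacity is an arbitrary example value
a = allocator.alloc("particles", 768, hot_hint=True)  # hinted and fits
b = allocator.alloc("grid", 512, hot_hint=True)       # hinted but does not fit
c = allocator.alloc("log", 64)                        # no hint
print(a.region, b.region, c.region)
```

In this toy model the caller only supplies a hint; the placement decision and the bookkeeping stay inside the allocator, which is the division of labor the abstract describes. The real framework additionally keeps the most frequently referenced data resident, which this sketch does not model.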


Local Memory; Scratchpad Memory; Start Address; High Performance Computer Architecture; MPSoC Architecture





Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Ning Deng (1)
  • Weixing Ji (1)
  • Jaxin Li (1)
  • Qi Zuo (1)
  1. Beijing Institute of Technology, Beijing, China
