Effective file data-block placement for different types of page cache on hybrid main memory architectures

Abstract

Hybrid main memory architectures that combine DRAM with non-volatile memories (NVMs) are becoming increasingly attractive because they can exploit the benefits of each memory technology, for example, high-speed writes on DRAM and low standby power consumption on NVMs. File data-block placement (FDP) on different types of page cache is one of the important problems that directly affect the performance and cost of file operations on a hybrid main memory architecture. Page cache is widely used in modern operating systems to expedite file I/O by mapping disk-backed file data-blocks held in main memory into the virtual address space of a process. In a hybrid main memory, the operating system can allocate memory types with different read/write costs as page cache. In this paper, we study the problem of placing file data-blocks on different types of page cache so as to minimize the total cost of file accesses in a program. We propose a dynamic programming algorithm, the FDP Algorithm, that solves the problem optimally for simple programs. We develop an ILP model of the file data-block placement problem for programs composed of multiple regions with data dependencies. We also propose an efficient heuristic, the global file data-block placement (GFDP) Algorithm, to obtain near-optimal solutions to the global file data-block placement problem on hybrid main memory. Experiments on a set of benchmarks show the effectiveness of the GFDP algorithm compared with a greedy strategy and the ILP. The results show that the GFDP algorithm reduces the total cost of file accesses by 51.3 % on average compared with the greedy strategy.
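To make the placement problem concrete, the following is a minimal sketch under stated assumptions, not the paper's FDP Algorithm, whose details the abstract does not give. It assumes unit-size data-blocks, a fixed number of DRAM page-cache slots, unlimited NVM page cache, and per-access costs invented purely for illustration. The names `Block`, `COST`, `access_cost`, and `place_blocks` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    """A file data-block with its access counts (illustrative model)."""
    reads: int
    writes: int


# Hypothetical per-access costs in arbitrary units; not taken from the paper.
COST = {
    "DRAM": {"read": 1, "write": 1},
    "NVM": {"read": 2, "write": 10},
}


def access_cost(block, mem):
    """Total cost of all accesses to `block` if it is cached in `mem`."""
    c = COST[mem]
    return block.reads * c["read"] + block.writes * c["write"]


def place_blocks(blocks, dram_slots):
    """Return (minimum total cost, placement list) via dynamic programming.

    best[i][k] is the minimum cost of placing the first i blocks when
    exactly k of them occupy DRAM slots; every other block goes to NVM.
    """
    inf = float("inf")
    n = len(blocks)
    best = [[inf] * (dram_slots + 1) for _ in range(n + 1)]
    best[0][0] = 0
    for i, b in enumerate(blocks, 1):
        for k in range(dram_slots + 1):
            # Option 1: block i goes to NVM, DRAM usage unchanged.
            if best[i - 1][k] < inf:
                best[i][k] = best[i - 1][k] + access_cost(b, "NVM")
            # Option 2: block i takes one DRAM slot.
            if k > 0 and best[i - 1][k - 1] < inf:
                cand = best[i - 1][k - 1] + access_cost(b, "DRAM")
                best[i][k] = min(best[i][k], cand)

    # Pick the cheapest final state, then walk back to recover the choices.
    k = min(range(dram_slots + 1), key=lambda j: best[n][j])
    total = best[n][k]
    placement = []
    for i in range(n, 0, -1):
        b = blocks[i - 1]
        if k > 0 and best[i - 1][k - 1] + access_cost(b, "DRAM") == best[i][k]:
            placement.append("DRAM")
            k -= 1
        else:
            placement.append("NVM")
    placement.reverse()
    return total, placement


blocks = [Block(reads=10, writes=0), Block(reads=1, writes=5), Block(reads=0, writes=3)]
print(place_blocks(blocks, dram_slots=1))
```

With a single DRAM slot, the sketch assigns it to the write-heavy second block, since that block would be the most expensive to keep on NVM under the assumed costs. With unit-size blocks a simple sort by cost savings would reach the same result; the table-based form is used only because it extends naturally to blocks of differing sizes.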



Acknowledgments

This work is partially supported by National 863 Program 2013AA013202, Chongqing High-Tech Research Program csct2012ggC40005, NSFC 61472052, NSFC 61173014.

Author information

Corresponding author

Correspondence to Qingfeng Zhuge.

About this article

Cite this article

Dai, P., Zhuge, Q., Chen, X. et al. Effective file data-block placement for different types of page cache on hybrid main memory architectures. Des Autom Embed Syst 17, 485–506 (2013). https://doi.org/10.1007/s10617-014-9148-3

Keywords

  • Hybrid main memory
  • Page cache
  • File data-block placement