Superpage-Friendly Page Table Design for Hybrid Memory Systems

  • Conference paper
Data Science (ICPCSEE 2020)

Abstract

Page migration has long been adopted in hybrid memory systems comprising dynamic random access memory (DRAM) and non-volatile memories (NVMs) to improve system performance and energy efficiency. However, page migration introduces side effects: more translation lookaside buffer (TLB) misses, broken memory contiguity, and extra memory accesses caused by page table updates. In this paper, we propose a superpage-friendly page table called SuperPT to reduce the performance overhead of serving TLB misses. By leveraging a virtual hashed page table and a hybrid DRAM allocator, SuperPT performs address translations flexibly and efficiently while still preserving contiguity within the migrated pages.

Notes

  1. Remainder is a simple hash function based on the remainder (modulo) operation.

  2. Wyhash [35] is a fast hash function on x86-64 without quality problems.
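
As a rough illustration of the Remainder hash described in note 1, the following Python sketch shows a toy hashed page table that selects a bucket with a modulo operation. It is not the SuperPT design: the names `remainder_hash` and `HashedPageTable`, the 4 KiB page size, the bucket count, and the chaining scheme are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple

PAGE_SHIFT = 12  # assumption: 4 KiB base pages


def remainder_hash(vpn: int, num_buckets: int) -> int:
    """Map a virtual page number (VPN) to a bucket with a plain modulo."""
    return vpn % num_buckets


class HashedPageTable:
    """Toy hashed page table with chained buckets (hypothetical names)."""

    def __init__(self, num_buckets: int = 1024) -> None:
        self.num_buckets = num_buckets
        # Each bucket holds (vpn, pfn) pairs that hash to the same index.
        self.buckets: List[List[Tuple[int, int]]] = [
            [] for _ in range(num_buckets)
        ]

    def map(self, vpn: int, pfn: int) -> None:
        """Insert a VPN -> PFN mapping, or update it if already present."""
        bucket = self.buckets[remainder_hash(vpn, self.num_buckets)]
        for i, (v, _) in enumerate(bucket):
            if v == vpn:
                bucket[i] = (vpn, pfn)
                return
        bucket.append((vpn, pfn))

    def translate(self, vaddr: int) -> Optional[int]:
        """Return the physical address for vaddr, or None if unmapped."""
        vpn = vaddr >> PAGE_SHIFT
        offset = vaddr & ((1 << PAGE_SHIFT) - 1)
        for v, pfn in self.buckets[remainder_hash(vpn, self.num_buckets)]:
            if v == vpn:
                return (pfn << PAGE_SHIFT) | offset
        return None
```

With eight buckets, VPNs 3 and 11 land in the same bucket, so a lookup has to compare the stored VPN against the requested one while walking the chain. A stronger hash such as Wyhash would typically spread regular VPN patterns more evenly, at the cost of more computation per lookup.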

References

  1. Dhiman, G., Ayoub, R., Rosing, T.: PDRAM: a hybrid PRAM and DRAM main memory system. In: Proceedings of the 46th Annual Design Automation Conference, pp. 664–669. ACM, New York (2009)

  2. Qureshi, M.K., Srinivasan, V., Rivers, J.A.: Scalable high performance main memory system using phase-change memory technology. In: Proceedings of the 36th Annual International Symposium on Computer Architecture, pp. 24–33. ACM, New York (2009)

  3. Ramos, L.E., Gorbatov, E., Bianchini, R.: Page placement in hybrid memory systems. In: Proceedings of the International Conference on Supercomputing, pp. 85–95. ACM, New York (2011)

  4. Liu, H., et al.: Hardware/software cooperative caching for hybrid DRAM/NVM memory architectures. In: Proceedings of the International Conference on Supercomputing, pp. 26:1–26:10. ACM, New York (2017)

  5. Wang, X., et al.: Supporting superpages and lightweight page migration in hybrid memory systems. ACM Trans. Archit. Code Optim. 16(2), 11:1–11:26 (2019)

  6. Bhattacharjee, A.: Large-reach memory management unit caches. In: Proceedings of the 46th Annual IEEE/ACM International Symposium on Microarchitecture, pp. 383–394. ACM, New York (2013)

  7. Romer, T.H., Ohlrich, W.H., Karlin, A.R., Bershad, B.N.: Reducing TLB and memory overhead using online superpage promotion. In: Proceedings of the 22nd Annual International Symposium on Computer Architecture, pp. 176–187. ACM, New York (1995)

  8. Talluri, M., Hill, M.D.: Surpassing the TLB performance of superpages with less operating system support. In: Proceedings of the Sixth International Conference on Architectural Support for Programming Languages and Operating Systems, pp. 171–182. ACM, New York (1994)

  9. Swanson, M., Stoller, L., Carter, J.: Increasing TLB reach using superpages backed by shadow memory. In: Proceedings of the 25th Annual International Symposium on Computer Architecture, pp. 204–213. IEEE Computer Society, Washington, DC (1998)

  10. Pham, B., Vaidyanathan, V., Jaleel, A., Bhattacharjee, A.: CoLT: coalesced large-reach TLBs. In: Proceedings of the 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture, pp. 258–269. IEEE Computer Society, Washington, DC (2012)

  11. Pham, B., Bhattacharjee, A., Eckert, Y., Loh, G.H.: Increasing TLB reach by exploiting clustering in page translations. In: Proceedings of the 2014 IEEE 20th International Symposium on High Performance Computer Architecture, pp. 558–567. IEEE Computer Society, Washington, DC (2014)

  12. Pham, B., Veselý, J., Loh, G.H., Bhattacharjee, A.: Large pages and lightweight memory management in virtualized environments: can you have it both ways? In: Proceedings of the 48th International Symposium on Microarchitecture, pp. 1–12. ACM, New York (2015)

  13. Gandhi, J., et al.: Range translations for fast virtual memory. IEEE Micro 36(3), 118–126 (2016)

  14. Yan, Z., Lustig, D., Nellans, D., Bhattacharjee, A.: Translation ranger: operating system support for contiguity-aware TLBs. In: Proceedings of the 46th International Symposium on Computer Architecture, pp. 698–710. ACM, New York (2019)

  15. Karakostas, V., et al.: Redundant memory mappings for fast access to large memories. In: Proceedings of the 42nd Annual International Symposium on Computer Architecture, pp. 66–78. ACM, New York (2015)

  16. Bhargava, R., Serebrin, B., Spadini, F., Manne, S.: Accelerating two-dimensional page walks for virtualized systems. In: Proceedings of the 13th International Conference on Architectural Support for Programming Languages and Operating Systems, pp. 26–35. ACM, New York (2008)

  17. Gandhi, J., Basu, A., Hill, M.D., Swift, M.M.: Efficient memory virtualization: reducing dimensionality of nested page walks. In: Proceedings of the 47th Annual IEEE/ACM International Symposium on Microarchitecture, pp. 178–189. IEEE Computer Society, Washington, DC (2014)

  18. Yan, Z., Veselý, J., Cox, G., Bhattacharjee, A.: Hardware translation coherence for virtualized systems. In: Proceedings of the 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture, pp. 430–443. ACM, New York (2017)

  19. Kandiraju, G.B., Sivasubramaniam, A.: Going the distance for TLB prefetching: an application-driven study. In: Proceedings of the 29th Annual International Symposium on Computer Architecture, pp. 195–206. IEEE, Anchorage (2002)

  20. Saulsbury, A., Dahlgren, F., Stenström, P.: Recency-based TLB preloading. In: Proceedings of the 27th Annual International Symposium on Computer Architecture, pp. 117–127. ACM, New York (2000)

  21. Yaniv, I., Tsafrir, D.: Hash, don’t cache (the page table). In: Proceedings of the 2016 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Science, pp. 337–350. ACM, New York (2016)

  22. Stallings, W.: Operating Systems: Internals and Design Principles, 7th edn. Pearson/Prentice Hall, Upper Saddle River (2011)

  23. Raoux, S., et al.: Phase-change random access memory: a scalable technology. IBM J. Res. Dev. 52(4.5), 465–479 (2008)

  24. Park, H., Yoo, S., Lee, S.: Power management of hybrid DRAM/PRAM-based main memory. In: Proceedings of the 48th Design Automation Conference, pp. 59–64. ACM, New York (2011)

  25. Wei, W., Jiang, D., McKee, S.A., Xiong, J., Chen, M.: Exploiting program semantics to place data in hybrid memory. In: Proceedings of the 2015 International Conference on Parallel Architecture and Compilation, pp. 163–173. IEEE Computer Society, Washington, DC (2015)

  26. SPEC CPU2006. https://www.spec.org/cpu2006. Last Accessed 21 Nov 2019

  27. Parsec. http://parsec.cs.princeton.edu/index.htm. Last Accessed 21 Nov 2019

  28. Bailey, D., et al.: The NAS parallel benchmarks. Int. J. Supercomput. Appl. 5(3), 63–73 (1991)

  29. Graph500. http://graph500.org/. Last Accessed 21 Nov 2019

  30. Jiang, X., et al.: CHOP: adaptive filter-based DRAM caching for CMP server platforms. In: Proceedings of the Sixteenth International Symposium on High-Performance Computer Architecture, pp. 1–12. IEEE Computer Society, Washington, DC (2010)

  31. Sanchez, D., Kozyrakis, C.: ZSim: fast and accurate microarchitectural simulation of thousand-core systems. In: Proceedings of the 40th Annual International Symposium on Computer Architecture, pp. 475–486. ACM, New York (2013)

  32. Luk, C.K., et al.: Pin: building customized program analysis tools with dynamic instrumentation. In: Proceedings of the 2005 ACM SIGPLAN Conference on Programming Language Design and Implementation, pp. 190–200. ACM, New York (2005)

  33. Poremba, M., Zhang, T., Xie, Y.: NVMain 2.0: a user-friendly memory simulator to model (non-)volatile memory systems. IEEE Comput. Archit. Lett. 14(2), 140–143 (2015)

  34. Lee, B.C., Ipek, E., Mutlu, O., Burger, D.: Architecting phase change memory as a scalable DRAM alternative. In: Proceedings of the 36th Annual International Symposium on Computer Architecture, pp. 2–13. ACM, New York (2009)

  35. Wyhash. https://github.com/rurban/smhasher. Last Accessed 21 Nov 2019

  36. Gorman, M., Healy, P.: Supporting superpage allocation without additional hardware support. In: Proceedings of the 7th International Symposium on Memory Management, pp. 41–50. ACM, New York (2008)

  37. Huge Pages Part 2 (Interfaces). https://lwn.net/Articles/375096/. Last Accessed 21 Nov 2019

  38. Barr, T.W., Cox, A.L., Rixner, S.: SpecTLB: a mechanism for speculative address translation. In: Proceedings of the 38th Annual International Symposium on Computer Architecture, pp. 307–318. ACM, New York (2011)

  39. Papadopoulou, M.M., Tong, X., Seznec, A., Moshovos, A.: Prediction-based superpage-friendly TLB designs. In: Proceedings of the 2015 IEEE 21st International Symposium on High Performance Computer Architecture, pp. 210–222. IEEE Computer Society, Washington, DC (2015)

  40. Du, Y., Zhou, M., Childers, B.R., Mossé, D., Melhem, R.: Supporting superpages in non-contiguous physical memory. In: Proceedings of the 2015 IEEE 21st International Symposium on High Performance Computer Architecture, pp. 223–234. IEEE Computer Society, Washington, DC (2015)

  41. Corbet, J., Rubini, A., Kroah-Hartman, G.: Linux Device Drivers: Where the Kernel Meets the Hardware, 3rd edn. O’Reilly Media, Sebastopol (2005)

  42. Wang, X., Liu, H., Liao, X., Jin, H., Zhang, Y.: TLB coalescing for multi-grained page migration in hybrid memory systems. IEEE Access 8, 66304–66314 (2020)

  43. Basu, A., Gandhi, J., Chang, J., Hill, M.D., Swift, M.M.: Efficient virtual memory for big memory servers. In: Proceedings of the 40th Annual International Symposium on Computer Architecture, pp. 237–248. ACM, New York (2013)

Author information

Corresponding author

Correspondence to Haikun Liu.

Copyright information

© 2020 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Wang, X., Liu, H., Liao, X., Jin, H. (2020). Superpage-Friendly Page Table Design for Hybrid Memory Systems. In: Zeng, J., Jing, W., Song, X., Lu, Z. (eds) Data Science. ICPCSEE 2020. Communications in Computer and Information Science, vol 1257. Springer, Singapore. https://doi.org/10.1007/978-981-15-7981-3_46

  • DOI: https://doi.org/10.1007/978-981-15-7981-3_46

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-15-7980-6

  • Online ISBN: 978-981-15-7981-3

  • eBook Packages: Computer Science (R0)
