Knowledge and Information Systems, Volume 41, Issue 2, pp 335–354

A hybrid memory built by SSD and DRAM to support in-memory Big Data analytics

  • Zhiguang Chen
  • Yutong Lu
  • Nong Xiao
  • Fang Liu
Regular Paper

Abstract

Big Data requires a shift in traditional computing architecture, and in-memory computing is a new paradigm for Big Data analytics. However, DRAM-based main memory is neither cost-effective nor energy-efficient. This work combines a flash-based solid state drive (SSD) and DRAM to build a hybrid memory that meets both requirements. Because the latency of SSD is much higher than that of DRAM, the hybrid architecture should guarantee that most requests are served by DRAM rather than by SSD. Accordingly, we take two measures to raise the DRAM hit ratio. First, the hybrid memory employs an adaptive prefetching mechanism so that data are already in DRAM before they are demanded. Second, the DRAM employs a novel replacement policy that preferentially evicts data that are easy to prefetch, because such data can be served by prefetching when they are demanded again. Conversely, data that are hard to prefetch are kept in DRAM. Both the prefetching mechanism and the replacement policy rely on the access patterns of files, so we propose a novel pattern recognition method that improves the LZ data compression algorithm to detect access patterns. We evaluate our proposals via a prototype and trace-driven simulations. Experimental results demonstrate that the hybrid memory is able to extend the DRAM by more than a factor of two.
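
The abstract does not give the algorithms themselves, so the following is only a minimal sketch of the two ideas under stated assumptions. The predictor is the classic LZ78-style phrase trie, not the authors' improved variant, and the cache simply evicts blocks flagged as easy to prefetch before falling back to LRU. The class names, the `prefetchable` flag, and the block labels are hypothetical and do not come from the paper.

```python
from collections import OrderedDict


class LZ78Predictor:
    """Classic LZ78-style next-access predictor (not the paper's improved variant)."""

    def __init__(self):
        self.root = {"count": 0, "children": {}}
        self.node = self.root  # current position in the phrase trie

    def observe(self, block):
        """Record one access and advance through the trie."""
        children = self.node["children"]
        if block in children:
            # Known continuation: bump its count and keep extending the phrase.
            children[block]["count"] += 1
            self.node = children[block]
        else:
            # A new phrase ends here: add a leaf and restart from the root.
            children[block] = {"count": 1, "children": {}}
            self.node = self.root

    def predict(self):
        """Return the most frequent successor of the current context, or None."""
        children = self.node["children"]
        if not children:
            return None
        return max(children, key=lambda b: children[b]["count"])


class PrefetchAwareCache:
    """Hypothetical DRAM cache: evict easy-to-prefetch blocks first, else plain LRU."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()  # block -> True if judged easy to prefetch

    def access(self, block, prefetchable):
        """Touch a block; return True on a hit, False on a miss."""
        hit = block in self.blocks
        if hit:
            self.blocks.move_to_end(block)
        elif len(self.blocks) >= self.capacity:
            self._evict()
        self.blocks[block] = prefetchable
        return hit

    def _evict(self):
        # Prefer the least recently used block that prefetching could bring back.
        for victim, easy in self.blocks.items():
            if easy:
                del self.blocks[victim]
                return
        # No such block: fall back to evicting the least recently used one.
        self.blocks.popitem(last=False)
```

In a real system the `prefetchable` flag would be derived from how reliably the predictor anticipates a block; here the caller supplies it directly to keep the sketch short.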

Keywords

  • Hybrid memory
  • SSD
  • Big Data
  • In-memory computing
  • Prefetch
  • Pattern recognition

Notes

Acknowledgments

We are grateful to our anonymous reviewers for their suggestions to improve this paper. This work is supported by the National High Technology Research and Development 863 Program of China under Grant No. 2013AA013201 and by the National Natural Science Foundation of China under Grant Nos. 61025009, 61120106005, 61232003, 61170288, 61379145, and 61332003.

Copyright information

© Springer-Verlag London 2014

Authors and Affiliations

  • Zhiguang Chen (1, 2)
  • Yutong Lu (1, 2)
  • Nong Xiao (1, 2)
  • Fang Liu (1, 2)

  1. State Key Laboratory of High Performance Computing, National University of Defense Technology, Changsha, People’s Republic of China
  2. College of Computer, National University of Defense Technology, Changsha, People’s Republic of China