A Light-Weight Hot Data Identification Scheme via Grouping-based LRU Lists

  • Conference paper
Algorithms and Architectures for Parallel Processing (ICA3PP 2015)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9531)

Abstract

Real-world workloads generally exhibit highly skewed access patterns, and it is widely agreed that separating hot and cold data can greatly improve storage system performance, for example the garbage collection (GC) performance of Solid State Drives (SSDs). The key issue is how to identify hot data accurately, which is challenging because workloads are highly diverse and dynamic. In this paper, we propose a light-weight, high-accuracy identification scheme that is built on a group of Least Recently Used (LRU) lists and requires only a small amount of memory and few CPU cycles. We further deploy our scheme on SSDs with the DiskSim simulator. Results show that, compared with two state-of-the-art identification schemes, our scheme further reduces SSD GC cost by up to 59.1% (62.1%) and saves 44.3% (77.5%) of the computational cost. Because it is light-weight and insensitive to its parameters, our scheme can be easily deployed at various system levels and adapts to different workloads.
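The abstract does not spell out how the LRU lists are organized, so the following is only a minimal sketch of the general idea of classifying hotness with several LRU lists, not the authors' algorithm. The class name `GroupedLRUIdentifier`, the promote-on-access and demote-on-overflow rules, and the parameters `num_lists`, `list_capacity`, and `hot_level` are all assumptions made for illustration.

```python
from collections import OrderedDict


class GroupedLRUIdentifier:
    """Illustrative hot-data classifier built from several LRU lists.

    Assumed policy (not taken from the paper): a block address enters
    list 0 on first access, moves up one list on every later access,
    and is pushed down one list when the list it sits in overflows.
    """

    def __init__(self, num_lists=3, list_capacity=2, hot_level=2):
        # Each OrderedDict keeps its least recently used entry first.
        self.lists = [OrderedDict() for _ in range(num_lists)]
        self.capacity = list_capacity
        self.hot_level = hot_level
        self.level_of = {}  # block address -> index of the list holding it

    def access(self, lba):
        """Record one access to the block address `lba`."""
        level = self.level_of.get(lba)
        if level is None:
            target = 0  # first sighting: start in the coldest list
        else:
            del self.lists[level][lba]
            target = min(level + 1, len(self.lists) - 1)  # promote
        self._insert(lba, target)

    def _insert(self, lba, level):
        self.lists[level][lba] = None  # appended at the MRU end
        self.level_of[lba] = level
        # Cascade overflow downwards; list 0 evicts entirely.
        while len(self.lists[level]) > self.capacity:
            victim, _ = self.lists[level].popitem(last=False)
            if level == 0:
                del self.level_of[victim]
            else:
                level -= 1
                self.lists[level][victim] = None
                self.level_of[victim] = level

    def is_hot(self, lba):
        """A block counts as hot if it sits in list `hot_level` or above."""
        return self.level_of.get(lba, -1) >= self.hot_level


ident = GroupedLRUIdentifier(num_lists=3, list_capacity=2, hot_level=2)
for lba in [7, 1, 7, 2, 7, 3]:
    ident.access(lba)
print(ident.is_hot(7), ident.is_hot(3), ident.is_hot(1))
```

In this toy trace, block 7 is accessed three times and climbs to the top list, block 3 is seen once and stays in list 0, and block 1 is evicted from list 0 when it overflows. Per-access work is a constant number of dictionary operations plus at most one demotion per list, which is the kind of low memory and CPU footprint the abstract has in mind, although the real scheme may differ in every detail.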

Acknowledgments

This work is supported in part by the National Natural Science Foundation of China under Grants No. 61379038 and No. 61303048, the Anhui Provincial Natural Science Foundation under Grant No. 1508085SQF214, and the Guangdong Key Laboratory of Popular High Performance Computers and the Shenzhen Key Laboratory of Service Computing and Applications under Grant No. SZUGDPHPCL2014.

Corresponding author

Correspondence to Yongkun Li.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Shen, B., Li, Y., Xu, Y., Pan, Y. (2015). A Light-Weight Hot Data Identification Scheme via Grouping-based LRU Lists. In: Wang, G., Zomaya, A., Martinez, G., Li, K. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2015. Lecture Notes in Computer Science, vol 9531. Springer, Cham. https://doi.org/10.1007/978-3-319-27140-8_7

  • DOI: https://doi.org/10.1007/978-3-319-27140-8_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-27139-2

  • Online ISBN: 978-3-319-27140-8

  • eBook Packages: Computer Science, Computer Science (R0)
