
ColdStore: A Storage System for Archival Data


Abstract

In the information era, the amount of stored data grows at an astonishing rate, yet a significant fraction of the data residing in storage is rarely accessed; such data is referred to as cold data. How to manage and maintain this data at low cost has become a pressing problem. In this paper, we present ColdStore, a large-scale, reliable, energy- and cost-efficient storage system. Unlike traditional file systems, ColdStore is designed specifically to store cold data efficiently in terms of both energy and cost. It is a highly available distributed system: a cluster composed of nodes with different roles. Metadata Nodes maintain the metadata of all files; Transfer Nodes handle encoding, decoding, and caching; and Storage Nodes store encoded data for durability. Dedicated hardware reduces the power consumption of Storage Nodes: each is equipped with a low-power CPU and a power supply control unit for its hard disk drives. With the proposed architecture, the vast majority of the hard disk drives, as many as 93.75%, can be powered off under an archival workload while the system still provides reasonable performance and high reliability.
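As a rough illustration of the headline figure: 93.75% is exactly 15/16, so one plausible reading (an assumption on our part, not a detail stated in the abstract) is that each power supply control unit manages drives in groups of 16 and keeps a single drive per group spun up. A minimal sketch under that assumption; all names here are hypothetical, not ColdStore's actual design or API:

```python
# Conceptual sketch only: models a Storage Node's power supply control
# unit keeping one drive per group of 16 spun up. The group size and all
# names (DriveGroup, activate, ...) are illustrative assumptions.

class DriveGroup:
    """A group of hard disk drives behind one power supply control unit."""

    def __init__(self, size=16):
        self.size = size
        self.active = 0  # index of the single powered-on drive

    def activate(self, idx):
        """Spin up drive `idx`; the previously active drive is powered down."""
        assert 0 <= idx < self.size
        self.active = idx

    def powered_off_fraction(self):
        """Fraction of drives in this group that are powered off."""
        return (self.size - 1) / self.size


group = DriveGroup()
print(f"{group.powered_off_fraction():.2%}")  # 93.75%
```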


Notes

  1. The Common Library defines data structures and functions widely utilized by all components in ColdStore.

  2. The Erlang Runtime has all the code necessary to run the Erlang node: file servers, code servers, and so on.

  3. Although the Transfer Node caches as much data as possible, the least recently used data may have to be discarded because of the limited disk space of Transfer Nodes (see the first sketch after these notes).

  4. The operation PopFirst() removes and returns the first record of the list. Since all records are kept ordered in the database, accessing the first record is efficient (see the second sketch after these notes).
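Note 3 describes least-recently-used eviction under a disk-space bound. A minimal sketch of that policy, assuming a simple entry-count capacity (illustrative only; `LRUCache` and its methods are not the Transfer Node's actual cache code):

```python
# Illustrative LRU eviction for a bounded cache, as note 3 describes.
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()   # least recently used entry first

    def get(self, key):
        if key not in self.entries:
            return None
        self.entries.move_to_end(key)  # mark as most recently used
        return self.entries[key]

    def put(self, key, value):
        if key in self.entries:
            self.entries.move_to_end(key)
        self.entries[key] = value
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # discard least recently used
```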
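Note 4's PopFirst() is cheap because the records are kept ordered, so the first record is always immediately at hand. A minimal sketch using a min-heap to keep the smallest key first (the `RecordQueue` name and heap-based layout are our assumptions; the paper's database maintains its ordering in its own way):

```python
# Illustrative PopFirst() over ordered records, as note 4 describes.
import heapq

class RecordQueue:
    """Records ordered by key; the smallest key is always first."""

    def __init__(self):
        self.heap = []  # (key, record) pairs; assumes unique keys

    def insert(self, key, record):
        heapq.heappush(self.heap, (key, record))  # O(log n)

    def pop_first(self):
        """Remove and return the record with the smallest key, O(log n)."""
        if not self.heap:
            return None
        _, record = heapq.heappop(self.heap)
        return record
```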


Acknowledgements

This work was supported in part by the National Natural Science Foundation of China (61671078, 61701031), the Director Funds of the Beijing Key Laboratory of Network System Architecture and Convergence (2017BKL-NSAC-ZJ-06), and the 111 Project of China (B08004, B17007). It was conducted on the platform of the Center for Data Science of Beijing University of Posts and Telecommunications.

Author information


Correspondence to Jun Liu.


Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


Cite this article

Chi, M., Liu, J. & Yang, J. ColdStore: A Storage System for Archival Data. Wireless Pers Commun 111, 2325–2351 (2020). https://doi.org/10.1007/s11277-019-06989-5
