GPU Erasure Coding for Campaign Storage

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10524)


High-performance computing (HPC) demands high bandwidth and low latency in I/O performance, driving the development of storage systems and I/O software components that strive for ever-greater performance. However, capital and energy budgets, along with increasing storage capacity requirements, have motivated the search for lower-cost, large storage systems for HPC. With burst buffer technology increasing the bandwidth and reducing the latency of I/O between the compute and storage systems, the back-end storage bandwidth and latency requirements can be relaxed, especially underneath an adequately sized modern parallel file system. Cloud computing has led to the development of large, low-cost storage solutions whose design focuses on high capacity, availability, and low energy consumption at the lowest cost. Cloud storage systems leverage replication and erasure coding to provide high availability at much lower cost than traditional HPC storage systems. Leveraging cloud storage infrastructure and concepts in HPC would be economically valuable, offering cost-effective performance for certain storage tiers. To enable the use of cloud storage technologies for HPC, we study an architecture that interposes cloud storage between the HPC parallel file systems and the archive storage. In this paper, we report our comparison of two erasure coding implementations for the Ceph file system. We compare measurements at various degrees of sharding that are relevant for HPC applications. We show that the Gibraltar GPU erasure coding library outperforms a CPU implementation of an erasure coding plugin for the Ceph object storage system, opening the potential for new ways to architect such storage systems based on Ceph.


Keywords: Erasure codes · Cloud storage technology · Gibraltar · Parallel file system (PFS)



This material is based upon work supported by the National Science Foundation under Grants Nos. ACI-1541310, CNS-0821497 and CNS-1229282. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.

This material is based upon work supported by Sandia National Laboratories. Sandia National Laboratories is a multi-mission laboratory managed and operated by Sandia Corporation, a wholly owned subsidiary of Lockheed Martin Corporation, for the U.S. Department of Energy’s National Nuclear Security Administration under contract DE-AC04-94AL85000.



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. Department of Computer and Information Sciences, University of Alabama at Birmingham, Birmingham, USA
  2. Center for Computing Research, Sandia National Laboratories, Albuquerque, USA
  3. Department of Computer Science and Engineering and McCrary Institute for Critical Infrastructure Protection and Cyber Systems, Auburn University, Auburn, USA
