OptRS: An Optimized Algorithm Based on CRS Codes in Big Data Storage Systems

  • Conference paper
Algorithms and Architectures for Parallel Processing (ICA3PP 2015)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9528)

Abstract

It is well known that erasure codes, such as Reed-Solomon (RS) and Cauchy RS (CRS) codes, play an important role in big data storage systems in both industry and academia. While RS and CRS codes provide significant savings in storage space, they can impose a heavy burden on system performance during encoding and decoding. Given the high reliability and space savings that existing coding technologies offer, there is an urgent need to deploy an efficient erasure coding mechanism in distributed storage systems, the main storage architecture of the big data era. This paper puts forward an optimized algorithm named OptRS (Optimized RS), which not only guarantees the system’s reliability but also improves the efficiency and utilization of storage space. Encoding and decoding in erasure codes are dominated by matrix computation. To speed up this calculation, OptRS converts matrix computation over the Galois field into XOR operations. Additionally, OptRS applies elimination schemes to minimize the number of XORs. Theoretical analysis shows that OptRS improves encoding and decoding performance and thereby shortens computation time, which our tests confirm. Encoding with OptRS is 36.1 % and 58.2 % faster than with CRS and RS coding, respectively. On average, the decoding rate with OptRS is 19.3 % and 33.1 % higher than with CRS and RS, respectively.
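The conversion from Galois field arithmetic to XOR mentioned in the abstract can be illustrated with the standard bit-matrix technique used by CRS-style codes. The sketch below is not the OptRS algorithm itself, whose details the abstract does not give. It assumes a small field GF(2^4) with the primitive polynomial x^4 + x + 1, and the names `gf_mul`, `bit_matrix`, `mul_by_xor`, and `xor_count` are hypothetical helpers introduced only for illustration. It shows that multiplying by a fixed field element is equivalent to applying a 4 x 4 binary matrix using XOR alone, and that the number of ones in that matrix determines how many XORs are needed, which is the quantity an elimination scheme would try to reduce.

```python
W = 4                # assumed word size: the field is GF(2^4)
PRIM_POLY = 0b10011  # assumed primitive polynomial x^4 + x + 1


def gf_mul(a, b):
    """Multiply two elements of GF(2^W) by shift-and-add with reduction."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & (1 << W):      # degree reached W: reduce modulo PRIM_POLY
            a ^= PRIM_POLY
    return result


def bit_matrix(e):
    """Return the W x W binary matrix that represents multiplication by e.

    Column j holds the bits of e * x^j, so applying the matrix to the
    bit vector of v over GF(2) yields the bit vector of e * v.
    """
    cols = [gf_mul(e, 1 << j) for j in range(W)]
    return [[(cols[j] >> i) & 1 for j in range(W)] for i in range(W)]


def mul_by_xor(matrix, v):
    """Apply a bit matrix to v using only bit selection and XOR."""
    out = 0
    for i, row in enumerate(matrix):
        bit = 0
        for j, m in enumerate(row):
            if m:
                bit ^= (v >> j) & 1
        out |= bit << i
    return out


def xor_count(matrix):
    """Count XORs needed to apply the matrix: a row with k ones costs k - 1."""
    return sum(max(sum(row) - 1, 0) for row in matrix)


if __name__ == "__main__":
    for e in (1, 2, 7):
        m = bit_matrix(e)
        print(f"element {e}: bit matrix {m}, XORs needed = {xor_count(m)}")
```

In a real code, each entry of the coding matrix is expanded this way, and each bit stands for a whole packet of data, so one matrix row becomes a sequence of packet-wide XORs. Matrices with fewer ones therefore encode faster, which is why reducing the XOR count matters.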

Author information

Corresponding author

Correspondence to Jianzong Wang.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Yin, C. et al. (2015). OptRS: An Optimized Algorithm Based on CRS Codes in Big Data Storage Systems. In: Wang, G., Zomaya, A., Martinez, G., Li, K. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2015. Lecture Notes in Computer Science, vol 9528. Springer, Cham. https://doi.org/10.1007/978-3-319-27119-4_16

  • DOI: https://doi.org/10.1007/978-3-319-27119-4_16

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-27118-7

  • Online ISBN: 978-3-319-27119-4

  • eBook Packages: Computer Science, Computer Science (R0)
