Abstract
We propose a new hybrid coding scheme for reliable data storage, homomorphic minimum bandwidth repairing (HMBR) codes, derived from exact minimum bandwidth regenerating (exact-MBR) codes and homomorphic self-repairing codes (HSRCs). Exact-MBR codes offer minimum bandwidth usage, whereas HSRCs have low computational overhead in node repair. Our coding scheme provides two options for node repair. The first repairs a node using minimum bandwidth at the cost of higher computational complexity, while the second repairs a node using fewer helper nodes, lower computational complexity and lower I/O overhead at the cost of higher bandwidth. Our scheme also introduces a basic integrity checking mechanism. Moreover, our proposed codes provide two different data reconstruction methods. The first typically has lower computational complexity, while the second requires less bandwidth. Our theoretical and experimental results show that the probability of successful node repair in HMBR codes is higher than that of HSRCs and slightly lower than that of exact-MBR codes. Our proposed codes are appropriate for systems in which cost parameters such as computational complexity, bandwidth, the number of helper nodes and I/O can change dynamically. Such systems can thus choose the appropriate method for node repair as well as for data reconstruction.
Notes
Notice that an [n, k, d] HMBR code encodes and stores more data than an [n, k] HSRC when both use the same field size.
Notice that an \([n,k\ge 2,d]\) HMBR coding scheme encodes and stores more data than an [n, k] HSRC when both use the same field size.
Here, d denotes the symbol size in bits, and the function R(x, d, r) counts the number of \(x \times d\) binary sub-matrices having rank r [15]. In HMBR, R(x, d, r) can be used to count all possible permutations of alive nodes that have at least k linearly independent polynomial inputs. This function is explained in Sect. 4.
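As an illustration of this kind of rank counting, the following minimal sketch computes the number of \(x \times d\) matrices over GF(2) with rank exactly r, using the standard closed-form count, and checks it against brute-force enumeration for small sizes. It assumes that every binary \(x \times d\) matrix is counted, with zero rows and repeated rows allowed; the exact definition of R(x, d, r) in [15] and in Sect. 4 may impose further constraints on the rows. The names `count_rank_r`, `gf2_rank` and `brute_force_count` are hypothetical and do not come from the paper.

```python
from itertools import product


def count_rank_r(x: int, d: int, r: int) -> int:
    """Number of x-by-d matrices over GF(2) with rank exactly r.

    Standard closed form (assumed here, not taken from the paper):
    prod_{i=0}^{r-1} (2^x - 2^i)(2^d - 2^i) / (2^r - 2^i).
    """
    if r < 0 or r > min(x, d):
        return 0
    numerator = 1
    denominator = 1
    for i in range(r):
        numerator *= (2**x - 2**i) * (2**d - 2**i)
        denominator *= 2**r - 2**i
    # The quotient is always an integer, so floor division is exact.
    return numerator // denominator


def gf2_rank(rows) -> int:
    """Rank over GF(2) of a matrix whose rows are given as integer bit masks."""
    basis = {}  # position of leading bit -> basis vector with that leading bit
    for row in rows:
        while row:
            lead = row.bit_length() - 1
            if lead not in basis:
                basis[lead] = row  # row is independent of the current basis
                break
            row ^= basis[lead]  # cancel the leading bit and keep reducing
    return len(basis)


def brute_force_count(x: int, d: int, r: int) -> int:
    """Count rank-r matrices by enumerating all 2^(x*d) binary matrices."""
    return sum(
        1 for rows in product(range(2**d), repeat=x) if gf2_rank(rows) == r
    )


if __name__ == "__main__":
    for x in range(1, 4):
        for d in range(1, 4):
            for r in range(0, min(x, d) + 1):
                assert count_rank_r(x, d, r) == brute_force_count(x, d, r)
    print(count_rank_r(3, 3, 3))
```

Summing `count_rank_r(x, d, r)` over all r gives \(2^{xd}\), the total number of binary \(x \times d\) matrices, which is a quick sanity check on the formula.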
References
Araujo J, Giroire F, Monteiro J (2011) Hybrid approaches for distributed storage systems. In: Hameurlain A, Tjoa A (eds) Data management in grid and peer-to-peer systems. Lecture notes in computer science, vol 6864. Springer, Berlin, pp 1–12
Bhagwan R, Tati K, Cheng YC, Savage S, Voelker GM (2004) Total recall: system support for automated availability management. In: NSDI. USENIX, Berkeley, pp 337–350
Chen Y, Alspaugh S, Katz R (2012) Interactive analytical processing in big data systems: a cross-industry study of MapReduce workloads. Proc VLDB Endow 5(12):1802–1813
Dickson LE (1901) Linear groups with an exposition of the Galois field theory. B.G. Teubner, Leipzig
Dimakis AG, Godfrey PB, Wu Y, Wainwright MJ, Ramchandran K (2010) Network coding for distributed storage systems. IEEE Trans Inf Theory 56(9):4539–4551
Duminuco A, Biersack E (2008) Hierarchical codes: how to make erasure codes attractive for peer-to-peer storage systems. In: P2P’08. Eighth international conference on peer-to-peer computing, 2008. IEEE, Aachen, pp 89–98
Gabidulin EM (1985) Theory of codes with maximum rank distance. Probl Inf Transm 21(1):3–16 (Translation of Problemy Peredachi Informatsii)
Gaston B, Pujol J, Villanueva M (2011) Quasi-cyclic minimum storage regenerating codes for distributed data compression. In: 2011 IEEE Data compression conference. IEEE, Snowbird, UT, pp 33–42
Haytaoglu E, Dalkilic ME (2013) Homomorphic minimum bandwidth repairing codes. In: Gelenbe E, Lent R (eds) Information sciences and systems 2013. Lecture notes in electrical engineering, vol 264. Springer, Paris, pp 339–348
Huang C, Chen M, Li J (2013) Pyramid codes: flexible schemes to trade space for access efficiency in reliable data storage systems. ACM Trans Storage 9(1):3:1–3:28
Huang C, Simitci H, Xu Y, Ogus A, Calder B, Gopalan P, Li J, Yekhanin S (2012) Erasure coding in Windows Azure Storage. In: Proceedings of the 2012 USENIX conference on annual technical conference, USENIX ATC’12. p 2
Jeffreys H, Jeffreys B (2000) Lagrange’s interpolation formula. 9.011 in methods of mathematical physics, 3rd edn. Cambridge University Press, Cambridge
Kamath GM, Silberstein N, Prakash N, Rawat AS, Lalitha V, Koyluoglu OO, Kumar PV, Vishwanath S (2013) Explicit MBR all-symbol locality codes. In: 2013 IEEE international symposium on information theory proceedings (ISIT). IEEE, Istanbul, pp 504–508
Kubiatowicz J, Bindel D, Chen Y, Czerwinski S, Eaton P, Geels D, Gummadi R, Rhea S, Weatherspoon H, Weimer W, Wells C, Zhao B (2000) OceanStore: an architecture for global-scale persistent storage. SIGPLAN Not 35(11):190–201
Oggier F, Datta A (2015) Self-repairing codes. Computing 97(2):171–201
Pamies-Juarez L, Hollmann HDL, Oggier FE (2013) Locally repairable codes with multiple repair alternatives. In: 2013 IEEE international symposium on information theory proceedings (ISIT). IEEE, Istanbul, pp 892–896
Papailiopoulos DS, Dimakis AG (2012) Locally repairable codes. In: 2012 IEEE international symposium on information theory proceedings (ISIT). IEEE, Cambridge, pp 2771–2775
Rai BK, Dhoorjati V, Saini L, Jha AK (2015) On adaptive distributed storage systems. In: 2015 IEEE international symposium on information theory (ISIT). pp 1482–1486
Rashmi KV, Nakkiran P, Wang J, Shah NB, Ramchandran K (2015) Having your cake and eating it too: jointly optimal erasure codes for I/O, storage, and network-bandwidth. In: Proceedings of the 13th USENIX conference on file and storage technologies, FAST 2015, Santa Clara, February 16–19, 2015, pp 81–94
Rashmi KV, Shah NB, Gu D, Kuang H, Borthakur D, Ramchandran K (2013) A solution to the network challenges of data recovery in erasure-coded distributed storage systems: a study on the Facebook warehouse cluster. In: USENIX. USENIX, San Jose
Rashmi KV, Shah NB, Kumar PV (2011) Optimal exact-regenerating codes for distributed storage at the MSR and MBR points via a product-matrix construction. IEEE Trans Inf Theory 57(8):5227–5239
Reed IS, Solomon G (1960) Polynomial codes over certain finite fields. J Soc Ind Appl Math 8(2):300–304
Shah NB (2013) On minimizing data-read and download for storage-node recovery. IEEE Commun Lett 17(5):964–967
Shum KW (2011) Cooperative regenerating codes for distributed storage systems. In: 2011 IEEE international conference on communications (ICC). IEEE, Kyoto, pp 1–5
Xia M, Saxena M, Blaum M, Pease DA (2015) A tale of two erasure codes in HDFS. In: 13th USENIX conference on file and storage technologies (FAST 15). pp 213–226
Cite this article
Haytaoglu, E., Dalkilic, M.E. A new hybrid coding scheme: homomorphic minimum bandwidth repairing codes. Computing 99, 1029–1054 (2017). https://doi.org/10.1007/s00607-017-0542-0
DOI: https://doi.org/10.1007/s00607-017-0542-0