Parallel Data Regeneration Based on Multiple Trees with Network Coding in Distributed Storage System

  • Pengfei You
  • Zhen Huang
  • Changjian Wang
  • Minghao Hu
  • Yuxing Peng
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9529)


Distributed storage systems provide large-scale data storage and high data reliability through redundancy schemes such as replication and erasure codes. Redundant data may be lost because nodes in the system fail frequently, and the lost data needs to be regenerated as soon as possible to maintain data availability and reliability. The direct way to reduce regeneration time is to reduce the network traffic generated during regeneration. By comparison, tree-structured regeneration achieves a shorter regeneration time by constructing a better tree-structured topology that increases transmission bandwidth. However, tree-structured regeneration leaves the bandwidth of many edges outside the tree unused. In this paper, we propose using multiple edge-disjoint trees to regenerate the lost data in parallel, and we analyze the total regeneration time. We derive a formula for the optimal regeneration time and propose an approximate construction algorithm with polynomial time complexity for the optimal multiple regeneration trees. Our experiments show that the regeneration time is reduced by 62% compared with the common tree-structured scheme, and that file availability reaches almost 99%.


Distributed storage systems; Data regeneration; Erasure code; Network coding; Overlay P2P; Maximum spanning tree



This research work is supported by the National Basic Research Program of China under Grant No. 2014CB340303, the Program of National Natural Science Foundation of China under Grants No. 61402514 and No. 61402490, and the Scientific Research Program of Hunan Provincial Education Department (No. 12b012).



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Pengfei You (1)
  • Zhen Huang (1)
  • Changjian Wang (1)
  • Minghao Hu (1)
  • Yuxing Peng (1)

  1. College of Computer, National University of Defense Technology, Changsha, China
