Abstract
A new model describing dependencies among system components as a directed graph is presented and used to solve a novel replica placement problem in data centers. A criterion for optimizing replica placements is formalized and explained. In this work, the optimization goal is to choose placements in which correlated failure events disable as few replicas as possible. A fast optimization algorithm is given for dependency models represented by trees. The main contribution of the paper is an \(O(n + \rho \log \rho )\) dynamic programming algorithm for placing \(\rho \) replicas on a tree with \(n\) vertices.
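To make the optimization goal concrete, the following is a minimal sketch of how a given placement on a tree might be scored. It is not the paper's \(O(n + \rho \log \rho )\) algorithm. It rests on assumptions not stated in the abstract: that the failure of a tree node disables every replica placed in that node's subtree, and that placements are compared lexicographically starting from the most damaging failures. The function names, the `children` dictionary format, and the example topology are all hypothetical.

```python
def failure_counts(children, root, placement):
    """Map each node to the number of replicas in its subtree.

    Assumption: if a node fails, every replica in its subtree is disabled.
    children:  dict, node -> list of child nodes (hypothetical format)
    placement: dict, node -> number of replicas placed on that node
    """
    counts = {}
    # Iterative post-order traversal, so deep trees do not hit the recursion limit.
    stack = [(root, False)]
    while stack:
        node, expanded = stack.pop()
        kids = children.get(node, [])
        if expanded:
            counts[node] = placement.get(node, 0) + sum(counts[c] for c in kids)
        else:
            stack.append((node, True))
            stack.extend((c, False) for c in kids)
    return counts


def failure_profile(counts, rho):
    """profile[i] = number of nodes whose failure disables exactly i replicas."""
    profile = [0] * (rho + 1)
    for disabled in counts.values():
        profile[disabled] += 1
    return profile


def is_better(profile_a, profile_b):
    """Assumed criterion: fewer nodes that disable many replicas wins,
    compared from the highest replica count downward."""
    return profile_a[::-1] < profile_b[::-1]


# Hypothetical topology: one root, three racks, two servers per rack.
children = {
    "root": ["rackA", "rackB", "rackC"],
    "rackA": ["s1", "s2"],
    "rackB": ["s3", "s4"],
    "rackC": ["s5", "s6"],
}
clustered = failure_profile(failure_counts(children, "root", {"s1": 1, "s2": 1, "s3": 1}), 3)
spread = failure_profile(failure_counts(children, "root", {"s1": 1, "s3": 1, "s5": 1}), 3)
```

Under these assumptions, the spread placement has no node whose failure disables exactly two of the three replicas, whereas the clustered placement has one (`rackA`), so `is_better(spread, clustered)` holds.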
This work was supported, in part, by the National Science Foundation (NSF) under grant number CNS-1115733.
The original version of this chapter was revised: Contents were corrected throughout the chapter. The erratum to this chapter is available at 10.1007/978-3-319-26626-8_60
Acknowledgments
We would like to acknowledge insightful comments from S. Venkatesan and Balaji Raghavachari during meetings about results contained in this paper, as well as comments from Conner Davis on a draft version of this paper.
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Mills, K.A., Chandrasekaran, R., Mittal, N. (2015). On Replica Placement in High-Availability Storage Under Correlated Failure. In: Lu, Z., Kim, D., Wu, W., Li, W., Du, DZ. (eds) Combinatorial Optimization and Applications. Lecture Notes in Computer Science, vol 9486. Springer, Cham. https://doi.org/10.1007/978-3-319-26626-8_26
DOI: https://doi.org/10.1007/978-3-319-26626-8_26
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-26625-1
Online ISBN: 978-3-319-26626-8
eBook Packages: Computer Science (R0)