Abstract
We consider the problem of choosing a balanced set of fault-tolerance techniques for distributed computer systems. The task is to choose a balanced set of versions of the system's modules so that the reliability of the set is maximized subject to cost constraints over the set of possible versions of the system. We describe the fault-tolerance techniques from which the choice is made and present a mathematical model within which the problem is formulated and its solution method is given. This problem has been widely studied in the literature. We give a detailed description of the method for choosing a balanced set of fault-tolerance techniques for distributed computer systems. The proposed method is an evolutionary algorithm that uses a fuzzy logic scheme. While the algorithm runs, the fuzzy logic scheme analyzes the results of each generation and uses this information to adjust the parameters of the evolutionary algorithm. Experimental research shows that the method makes it possible to obtain an efficient solution. A key feature of the proposed approach is the use of this adaptive scheme. The method has been implemented as software integrated with the DYANA simulation environment. The conclusion briefly describes directions for future research.
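To make the idea in the abstract concrete, the following is a minimal sketch, not the author's implementation. It assumes a series system (overall reliability is the product of module reliabilities), a single cost budget, and a toy two-rule fuzzy-style controller that adapts only the mutation rate. The data in `MODULES`, the value of `BUDGET`, the membership thresholds, and all function names are hypothetical and chosen only for illustration.

```python
import random

# Hypothetical data: for each module, the available versions as (reliability, cost).
MODULES = [
    [(0.90, 1.0), (0.95, 2.0), (0.99, 4.0)],
    [(0.85, 1.0), (0.93, 2.5), (0.98, 5.0)],
    [(0.92, 1.5), (0.97, 3.0), (0.995, 6.0)],
]
BUDGET = 9.0  # hypothetical cost constraint


def reliability(choice):
    """Assumed series system: product of the chosen versions' reliabilities."""
    r = 1.0
    for module, v in zip(MODULES, choice):
        r *= module[v][0]
    return r


def cost(choice):
    """Total cost of the chosen versions."""
    return sum(module[v][1] for module, v in zip(MODULES, choice))


def fitness(choice):
    """Reliability if the budget is met, otherwise 0 (a simple penalty)."""
    return reliability(choice) if cost(choice) <= BUDGET else 0.0


def low(x, a, b):
    """Membership in 'low': 1 at or below a, 0 at or above b, linear in between."""
    if x <= a:
        return 1.0
    if x >= b:
        return 0.0
    return (b - x) / (b - a)


def adapt_mutation(rate, improvement, diversity):
    """Toy fuzzy-style rule base (an assumption, not the paper's scheme):
    if improvement is low AND diversity is low, push the rate toward 0.5,
    otherwise push it toward 0.05."""
    stagnation = min(low(improvement, 0.0, 0.01), low(diversity, 0.2, 0.6))
    target = stagnation * 0.5 + (1.0 - stagnation) * 0.05
    return 0.5 * rate + 0.5 * target


def evolve(generations=60, pop_size=20, seed=0):
    rng = random.Random(seed)
    n = len(MODULES)
    pop = [[rng.randrange(len(m)) for m in MODULES] for _ in range(pop_size)]
    pop[0] = [0] * n  # cheapest configuration, feasible for the data above
    rate = 0.2
    best = max(pop, key=fitness)
    for _ in range(generations):
        prev_best = fitness(best)
        children = []
        while len(children) < pop_size - 1:
            # Binary tournament selection followed by one-point crossover.
            p1 = max(rng.sample(pop, 2), key=fitness)
            p2 = max(rng.sample(pop, 2), key=fitness)
            cut = rng.randrange(1, n)
            child = p1[:cut] + p2[cut:]
            # Per-gene mutation with the current adaptive rate.
            child = [
                rng.randrange(len(MODULES[i])) if rng.random() < rate else g
                for i, g in enumerate(child)
            ]
            children.append(child)
        pop = [best] + children  # elitism keeps the best solution found so far
        best = max(pop, key=fitness)
        # Feed per-generation statistics back into the parameter controller.
        diversity = len({tuple(c) for c in pop}) / pop_size
        rate = adapt_mutation(rate, fitness(best) - prev_best, diversity)
    return best, fitness(best), cost(best)
```

Calling `evolve()` returns the selected version index for each module together with its reliability and cost. Because of elitism, the result is never worse than the cheapest feasible configuration the population starts with. A fuller controller would adapt more parameters (for example, crossover probability) with a richer rule base.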
Additional information
Original Russian Text © D.Yu. Volkanov, 2016, published in Modelirovanie i Analiz Informatsionnykh Sistem, 2016, Vol. 23, No. 2, pp. 119–136.
About this article
Cite this article
Volkanov, D.Y. Method for Choosing a Balanced Set of Fault-Tolerance Techniques for Distributed Computer Systems. Aut. Control Comp. Sci. 51, 539–550 (2017). https://doi.org/10.3103/S0146411617070239
DOI: https://doi.org/10.3103/S0146411617070239