Method for Choosing a Balanced Set of Fault-Tolerance Techniques for Distributed Computer Systems
We consider the problem of choosing a balanced set of fault-tolerance techniques for distributed computer systems. The task is to choose a balanced set of versions of the system's modules so that the reliability of the set is maximized subject to cost constraints on the set of possible system versions. We describe the fault-tolerance techniques from which the choice is made and present the mathematical model within which the problem is formulated and solved. This problem is widely considered in the literature. We then describe in detail the proposed method, an evolutionary algorithm combined with a fuzzy-logic scheme. While the algorithm runs, the fuzzy-logic scheme analyzes the results obtained in each generation and uses this information to adjust the parameters of the evolutionary algorithm; this adaptive scheme is a key feature of the approach. Experimental research shows that the method yields efficient solutions. The method has been implemented as software integrated with the DYANA simulation environment. The paper concludes with a brief description of future research directions.
Keywords: reliability, fault-tolerance, computer systems, genetic algorithm, reliability optimization problem, fault-tolerance techniques, evolutionary algorithm
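To make the idea in the abstract concrete, the following is a minimal sketch of an evolutionary search that picks one version per module to maximize reliability under a cost budget, while adjusting its own mutation rate from generation to generation. Everything in it is an assumption for illustration: the module data, the budget, the series-system reliability model (product of module reliabilities), and the simple `adapt_mutation_rate` rule, which is only a crude stand-in for the fuzzy-logic scheme. It is not the model or the algorithm of the paper.

```python
import random

# Hypothetical data: for each module, candidate versions as (reliability, cost).
MODULES = [
    [(0.90, 1.0), (0.95, 2.0), (0.99, 4.0)],
    [(0.85, 1.0), (0.93, 2.5), (0.98, 5.0)],
    [(0.92, 1.5), (0.97, 3.0), (0.995, 6.0)],
]
BUDGET = 9.0  # hypothetical cost limit


def reliability(choice):
    """Assumed series system: product of the chosen versions' reliabilities."""
    r = 1.0
    for versions, v in zip(MODULES, choice):
        r *= versions[v][0]
    return r


def cost(choice):
    """Total cost of the chosen versions."""
    return sum(versions[v][1] for versions, v in zip(MODULES, choice))


def fitness(choice):
    """Reliability if the choice fits the budget, otherwise 0 (infeasible)."""
    return reliability(choice) if cost(choice) <= BUDGET else 0.0


def adapt_mutation_rate(rate, improved):
    """Stand-in for the adaptive scheme: mutate more when the search stagnates."""
    return max(0.05, rate * 0.8) if improved else min(0.5, rate * 1.25)


def evolve(generations=60, pop_size=20, seed=0):
    rng = random.Random(seed)
    sizes = [len(versions) for versions in MODULES]
    pop = [[rng.randrange(s) for s in sizes] for _ in range(pop_size)]
    pop[0] = [0] * len(sizes)  # cheapest choice, feasible for this data
    rate = 0.2
    best = max(pop, key=fitness)
    for _ in range(generations):
        # Keep the better half as parents (elitism), refill with children.
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, len(sizes))  # one-point crossover
            child = a[:cut] + b[cut:]
            # Mutation: re-draw each gene with probability `rate`.
            child = [rng.randrange(s) if rng.random() < rate else g
                     for g, s in zip(child, sizes)]
            children.append(child)
        pop = parents + children
        gen_best = max(pop, key=fitness)
        improved = fitness(gen_best) > fitness(best)
        if improved:
            best = gen_best
        rate = adapt_mutation_rate(rate, improved)
    return best, fitness(best)
```

Calling `evolve()` returns a list of version indices (one per module) together with its reliability. Because the better half of each population is carried over unchanged, the returned choice is never worse than the cheapest feasible one it starts from.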