Abstract
Most modern high-performance information processing systems are scalable. In scalable computing systems, performance is improved by increasing the number of modules of the same type (for example, computing nodes). In the Top 500 list (56th edition, November 2020), 93% of high-performance systems are cluster-based and highly scalable. The number of nodes in these systems can reach hundreds of thousands. On the other hand, increasing the amount of resources exacerbates the reliability problem and complicates the organization of effective operation. Analyzing the reliability and potential capabilities of computer systems is therefore a pressing problem. Within the framework of queuing theory, a model of the functioning of scalable computer systems subject to failures is considered that takes the switching time into account. The solution uses the apparatus of generating functions and combinatorial methods. The paper offers analytical solutions for the probability distribution of system states for a three-parameter model in the transient and stationary modes of operation. It is shown that the solutions of the three-parameter model reduce to those of the two-parameter model if the switching time is not taken into account.
Acknowledgements
This work was carried out under a state contract with ISP SBRAS (grant No. 0242-2021-0011) and with the support of the Russian Foundation for Basic Research (grant No. 20-07-00039).
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Pavsky, V.A., Pavsky, K.V. (2022). Mathematical Model with Three Parameters for Calculating Probabilities of States of Scalable Computer Systems. In: Solovev, D.B., Kyriakopoulos, G.L., Venelin, T. (eds) SMART Automatics and Energy. Smart Innovation, Systems and Technologies, vol 272. Springer, Singapore. https://doi.org/10.1007/978-981-16-8759-4_24
DOI: https://doi.org/10.1007/978-981-16-8759-4_24
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-8758-7
Online ISBN: 978-981-16-8759-4
eBook Packages: Intelligent Technologies and Robotics (R0)