Cloud-integrated cyber–physical systems: Reliability, performance and power consumption with shared-servers and parallelized services

Ma, Shuyi; Li, Jin; Li, Jianping; Xie, Min

doi:10.1007/s42524-023-0272-2

Research Article
Open access
Published: 08 February 2024

Frontiers of Engineering Management

Shuyi Ma¹,², Jin Li¹, Jianping Li³ & Min Xie²

Abstract

Cloud systems, which are typical cyber–physical systems, consist of physical nodes and virtualized facilities that collaborate to fulfill cloud computing services. The advent of virtualization technology enables resource sharing and service parallelism in cloud services, introducing novel challenges to system modeling. In this study, we construct a systematic model that concurrently evaluates system reliability, performance, and power consumption (PC) while capturing cloud service disruptions arising from random hardware and software failures. First, we describe system states using a birth–death process that accommodates resource sharing and service parallelism. Given the relatively short service duration and regular failure distributions, we employ transient-state transition probabilities instead of steady-state analysis. The birth–death process links system reliability, performance, and PC through service durations governed by service assignment decisions and failure/repair distributions. Next, we devise a multistage sample path randomization method to estimate system metrics and other factors related to service availability. The findings show that the trade-off between performance and PC, subject to reliability guarantees, hinges on the balance between service duration and unit power. To explore this further, we formulate optimization models for service assignment and compare optimal decisions under varying availability scenarios, workload levels, and service attributes. Numerical results indicate that service parallelism can improve performance and conserve energy when the workload remains moderate. However, as the workload escalates, the performance loss induced by resource sharing becomes more pronounced due to resource capacity limitations. In cases where system availability is constrained, resource sharing should be approached cautiously to ensure adherence to deadline requirements. This study theoretically analyzes the interrelations among system reliability, performance, and PC, offering valuable insights for making informed decisions in cloud service assignments.
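
To make the abstract's preference for transient-state probabilities over steady-state analysis more concrete, the following is a minimal sketch of how transient probabilities of a small birth–death chain can be computed with standard uniformization (also known as randomization). It is not the authors' multistage sample path randomization method. The function names, the rates, and the modeling choices (state = number of working servers, independent failures, a single repair crew) are assumptions introduced purely for illustration.

```python
import math

def birth_death_generator(n_servers, fail_rate, repair_rate):
    """Generator matrix Q for states k = 0..n_servers (number of working servers).

    Assumptions (hypothetical, for illustration only): each working server
    fails independently at fail_rate, and a single repair crew restores one
    server at a time at repair_rate.
    """
    size = n_servers + 1
    Q = [[0.0] * size for _ in range(size)]
    for k in range(size):
        if k > 0:
            Q[k][k - 1] = k * fail_rate   # one of the k working servers fails
        if k < n_servers:
            Q[k][k + 1] = repair_rate     # one failed server is repaired
        Q[k][k] = -sum(Q[k])              # diagonal makes each row sum to zero
    return Q

def transient_probabilities(Q, p0, t, tol=1e-12, max_terms=100_000):
    """State distribution at time t via uniformization.

    p(t) = sum_n Poisson(n; rate * t) * p0 * P^n, with P = I + Q / rate,
    truncated once the neglected Poisson mass drops below tol.
    """
    size = len(Q)
    rate = max(-Q[i][i] for i in range(size))   # uniformization rate
    if rate == 0.0 or t == 0.0:
        return list(p0)
    P = [[(1.0 if i == j else 0.0) + Q[i][j] / rate for j in range(size)]
         for i in range(size)]
    weight = math.exp(-rate * t)                # Poisson weight for n = 0
    v = list(p0)                                # holds p0 * P^n
    result = [weight * x for x in v]
    cumulative = weight
    for n in range(1, max_terms + 1):
        if 1.0 - cumulative <= tol:
            break
        v = [sum(v[i] * P[i][j] for i in range(size)) for j in range(size)]
        weight *= rate * t / n                  # Poisson weight for this n
        cumulative += weight
        result = [r + weight * x for r, x in zip(result, v)]
    return result

if __name__ == "__main__":
    # Three servers, all working at time 0; rates are made-up example values.
    Q = birth_death_generator(n_servers=3, fail_rate=0.05, repair_rate=0.5)
    p_t = transient_probabilities(Q, p0=[0.0, 0.0, 0.0, 1.0], t=2.0)
    print([round(p, 4) for p in p_t])
```

When the time horizon `t` is short relative to the failure and repair time scales, the resulting distribution still depends strongly on the initial state, which is one way to see why a steady-state approximation may be a poor fit for short service durations.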

Author information

Authors and Affiliations

School of Management, Xi’an Jiaotong University, Xi’an, 710049, China
Shuyi Ma & Jin Li
Department of Systems Engineering, City University of Hong Kong, Hong Kong, China
Shuyi Ma & Min Xie
School of Economics and Management, University of Chinese Academy of Sciences, Beijing, 100190, China
Jianping Li

Corresponding author

Correspondence to Jin Li.

Ethics declarations

Competing Interests: The authors declare that they have no competing interests.

Additional information

This research was supported by the National Natural Science Foundation of China (Grant Nos. 72372131, T2293774, and 71901169), the Shaanxi Province Innovative Talents Promotion Plan–Youth Science and Technology Nova Project (Grant No. 2022KJXX-50), and the Youth Talent Promotion Project of China Association for Science and Technology (Grant No. YESS20200072).

Electronic Supplementary Material

42524_2023_272_MOESM1_ESM.pdf

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.

The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Ma, S., Li, J., Li, J. et al. Cloud-integrated cyber–physical systems: Reliability, performance and power consumption with shared-servers and parallelized services. Front. Eng. Manag. (2024). https://doi.org/10.1007/s42524-023-0272-2

Received: 30 April 2023
Revised: 31 July 2023
Accepted: 28 August 2023
Published: 08 February 2024
DOI: https://doi.org/10.1007/s42524-023-0272-2
