1 Introduction

Cloud Computing has transformed the industrial landscape with numerous advantages. It has significantly reduced IT infrastructure costs, streamlined product deployment processes, lowered maintenance expenses, and enabled rapid resource allocation to handle unforeseen demands [1]. According to Sharifi, cloud data centers are more energy-efficient than traditional on-premises setups [2]. This improved energy efficiency not only results in cost savings but also contributes to reducing environmental impacts by minimizing energy consumption and greenhouse gas emissions, as demonstrated by Nara et al. [3]. However, projections indicate that electricity demand for data centers will further increase significantly if current trends continue, emphasizing the importance of ongoing advancements in cloud technology to address resource management and power consumption optimization challenges [4]. Therefore, improving cloud services and harnessing the full potential of cloud data centers for resource management are essential objectives for both the academic and business sectors.

In this study, we strive to improve resource utilization and reduce power consumption by relocating software containers. This approach is designed to improve the balance of resources within the infrastructure. Several studies have explored this issue in the past few years. Zheng et al. [5] proposed a supercomputing cluster based on software container virtualization incorporating an autoregressive model and resource-load balance algorithms. This approach minimizes energy consumption and reduces service level agreement (SLA) violations. Also, the research conducted by Smimite et al. [6] suggested a methodology that utilizes RAM thresholds to facilitate software container migration over virtual machines (VMs) to reduce power consumption and migration time. Additionally, Wei et al. [7] presented an online software container orchestration algorithm based on Lyapunov optimization and Markov approximation, demonstrating a power consumption reduction in data centers without requiring future information.

The motivation behind this study lies in the need for proper resource management and prevention of resource shortages by applying software container migration. This strategy is critical to effective resource management and preservation. Efficient migration offers two significant benefits: the first is the improvement of quality of services (QoS), which is accomplished by optimizing parameters such as response time, CPU load, and memory utilization. To maintain QoS, cloud resource providers and their customers agree to uphold it through SLAs. The second advantage is reducing overall energy consumption by preventing resource overload on the hosting infrastructure.

The main contributions of this paper are as follows.

  1. We introduce a novel multi-objective ILP approach for software container replacements, which sets itself apart by incorporating dual objective functions within the model and constraints to prevent resource scarcity throughout the entire system. The primary goals of this model are to reduce the gap between the average CPU utilization of the entire system and that of the utilized physical servers and to lower the probability of migrating software containers after migration events.

  2. We present a specialized algorithm that solves the proposed replacement model significantly more efficiently than the genetic algorithm (GA) and particle swarm optimization (PSO); both the GA and PSO are adapted as alternative solution approaches for the proposed model.

  3. We performed simulations to evaluate the efficacy of our proposed deployment strategy. The results of our experiments confirm that incorporating our model with the proposed algorithm can significantly reduce overall energy consumption and effectively mitigate resource scarcity within host systems.

The article is organized as follows: Sect. 2 provides a thorough literature review and a comprehensive summary of relevant work. Section 3 defines the problem, outlining its significance and the extent of its impact. Section 4 outlines the mathematical model, the formal framework used in the analysis, and the benchmark model that serves as the basis for comparison. Section 5 focuses on data collection and the solution methodology, describing the data sources and the approaches used to tackle the problem. Section 6 presents the experimental results and the findings of our analysis. Finally, Sect. 7 concludes the study and summarizes the essential findings and their implications for future research.

2 Related work

The existing literature includes many works that scrutinize the resource management of software containers. It can be divided into two directions:

  1. Methodologies and tactics for auto-scaling and

  2. Orchestration and administration of software containerized environments.

2.1 Methodologies and tactics for auto-scaling

Numerous studies have been undertaken on software container auto-scaling to address the challenges of dynamically adapting resources to accommodate fluctuating workload requirements.

In a complementary study, Fourati et al. (2022) [8] introduced and evaluated an EPMA framework to enhance resource management within microservice architectures. The framework integrates adaptive resource allocation mechanisms, predictive analysis, and Machine Learning (ML) techniques to optimize resource elasticity according to the prevailing workload and performance demands. The EPMA framework demonstrated its efficacy in achieving optimal resource elasticity for microservice-based applications, as indicated by the findings of this study.

In addition, Sheganaku et al. (2023) [9] investigated the design and implementation of strategies to optimize resource allocation and scale decisions to minimize operational expenses. By exploring various auto-scaling techniques, this study systematically analyzed their effectiveness in dynamically adjusting resource provisioning to satisfy varying workload demands. This study focused on strategies prioritizing cost considerations while ensuring optimal performance and scalability. Additionally, this study contributes to the field by providing insights into practical methods for achieving cost-efficient auto-scaling in software containerized elastic processes.

Qian et al. (2023) [10] devised and evaluated a load-balancing scheduling mechanism that optimizes resource utilization and application performance in the context of OpenStack-Docker integration. The authors proposed an approach that dynamically allocates resources to software containers based on workload characteristics, aiming to mitigate resource underutilization and overloading. This study combined OpenStack's infrastructure management capabilities and Docker's containerization technology to implement the proposed load-balancing mechanism. Key performance metrics such as resource utilization and application responsiveness were evaluated to determine the efficacy of the load-balancing scheduling mechanism.

2.2 Container orchestration and management

Numerous studies have been undertaken on software container orchestration and management to address the challenges of dynamically adjusting resources to accommodate fluctuating workload demands.

The introduction of an Adaptive Container Deployment (ACD) model for the deployment and adaptation of containerized applications in geo-distributed settings was presented in [11]. This model serves as a framework for managing containerized applications in such environments and examines the efficacy of several greedy heuristics in determining the behavior of software container deployment. ACD, which serves as the core subject of this research, was proven to be a highly adaptable model that can efficiently optimize various runtime deployment objectives by effectively utilizing the horizontal and vertical elasticities of the software containers.

Rossi et al. [12] proposed a thorough two-step approach for the organization of container-based application deployment in geo-distributed computing environments in 2019. The proposed methodology is specifically designed to address both the horizontal and vertical elasticity aspects while simultaneously promoting an effective software container arrangement and enhancing adaptability during runtime. In the first step, Reinforcement Learning (RL) is leveraged to dynamically control the elasticity of the software containers. In the following step, the study confronts the issue of software container placement by resolving an ILP problem or employing a network-aware heuristic. This two-pronged approach not only addresses the administration of runtime adaptability for containerized applications but also emphasizes the importance of employing RL for elasticity management and resolving intricate placement issues using advanced optimization methods.

Yadav et al. (2021) [13] considered the challenges of maintaining the sustainability of containerized applications, which have gained significant importance in modern computing environments. This study emphasizes the role of ML in enhancing resource utilization, optimizing performance, and mitigating potential issues in containerized systems. The authors highlighted the potential of ML algorithms to identify and resolve sustainability-related concerns proactively.

Yadav et al. (2022) [14] addressed the concept of software container elasticity in the context of the Docker technology. This study focuses on enhancing containerized applications' scalability and resource utilization by dynamically adjusting the number of software containers based on response time. A mechanism that monitors the response time of containerized applications is proposed, and the automatic scaling of software containers is initiated to meet the performance requirements. This study's contribution is optimizing containerized application deployment and management by utilizing the response time as a critical metric for elastic scaling.

Finally, Daradkeh et al. [15] addressed the modeling and optimization of cloud-based systems using a microservice architecture. This study employs simulation modeling to investigate resource allocation and release dynamics in response to varying workloads. This research provides insights into performance metrics under different scenarios, contributing to knowledge about efficient cloud resource management.

Given the significant similarities between our proposed ILP and the models used in [11] and [12], those models are viable benchmarks in our study.

Most proposals on containerization scheduling focus on a load-balancing scheduling mechanism that optimizes resource utilization and avoids resource shortages. Complementary to these proposals, we focus on decreasing the distance between the CPU usage of each physical server and the average CPU usage of the selected servers in order to decrease the total power consumption. Thus, we replace software containers so that the CPU usage of each physical server stays close to the total average CPU usage. Specifically, we propose a novel multi-objective replacement model and a new algorithm that keep the CPU usage of the hosts close to the total average CPU utilization of the physical servers.

3 Problem definition

Containerization, popularized by Docker and Kubernetes, has significantly altered how software is deployed. At the same time, the extensive use of containerized applications has raised questions regarding potential resource shortages.

Software container migration, a pivotal process in software container management and orchestration, has gained prominence due to its ability to address resource shortages. By migrating software containers between different hosts or VMs, we can implement load-balancing strategies that optimize resource utilization and consistently meet application requirements. Research by Huang et al. [16] and Lu et al. [17] underscores the detrimental effects of inadequately allocated CPU resources, leading to latency and throughput issues, and of memory shortages, resulting in crashes.

In addition, a management or orchestration service for software containers responsible for load distribution within a physical server may relocate one service from an overloaded host to another to prevent service degradation. Alternatively, the same management or orchestration service may relocate all services from one host to another to activate its power-saving mode. Kaushik et al. [18] highlighted that thoughtful resource management can considerably lower energy usage, aligning with green computing goals. Implementing adaptive, energy-aware computation offloading can effectively mitigate the energy consumption of cloud computing systems, which are projected to constitute a substantial portion of the overall energy usage within the cloud infrastructure.

Although the migration of software containers across data centers offers improved performance and resource utilization, some critical challenges impede this process.

  1. Migrating software containers between VMs incurs a processing overhead, which temporarily affects application performance.

  2. The frequent migration of software containers can lead to a breach of SLAs, increase the operational workload, and put application dependability at risk.

  3. Software container resource needs change rapidly, complicating accurate provisioning and potentially causing resource wastage or shortage.

To address these issues, this study presents a mathematical model that relocates software containers among the VMs. The goal of the proposed model is to address the scarcity of resources in software containers, VMs, and physical servers while also reducing the overall power consumption of data centers. We further propose an algorithm to tackle the challenges that arise when migrating software containers; this algorithm is designed to optimize the allocation and replacement of software containers and improve the efficiency of resource utilization. To assess its effectiveness in managing horizontal scaling complexities, its performance has been evaluated and compared with that of existing algorithms.

4 Mathematical replacement models

Here, we propose a novel ILP software container replacement model aimed at optimizing CPU utilization, reducing energy consumption, and preventing potential resource shortages, such as memory and network bandwidth. This model represents an innovative approach, harmonizing computational efficiency with resource allocation dynamics. Furthermore, we compare our model with the benchmark framework proposed by Rossi et al. in 2019 [19], evaluating its efficacy in the current scientific landscape.

4.1 Mathematical definition and formulation of the proposed model

Our proposed model, Resource-Optimized Container Allocation and Migration System (ROCAMS), directly responds to resource utilization and energy consumption challenges in physical server settings. It focuses on the crucial relationship between CPU usage and power consumption. Our model strives to optimize the distribution of resources among hosts to minimize energy consumption while maintaining operational efficiency.

To achieve these objectives, the model defines objective functions and constraints tailored to address the various aspects of resource utilization and distribution. The objective functions prioritize minimizing excessive software container migrations and reducing disparities in CPU, memory, and network bandwidth usage across the servers. Meanwhile, the constraints ensure sufficient resources for software containers within VMs and maintain CPU usage within specified bounds relative to the average usage of the selected server groups, which are discussed in Sect. 5. In addition, the constraints address memory, network bandwidth, and network latency considerations to ensure optimal performance and resource allocation.

The symbols used in the problem formulation are as follows:

  • \(p= \left\{{p}_{1}, {p}_{2},...,{p}_{n}\right\}\) is the set of physical servers with the same configuration in the cloud data center.

  • \(v= \left\{{v}_{1}, {v}_{2},...,{v}_{n}\right\}\), \(u= \left\{{u}_{1}, {u}_{2},...,{u}_{n}\right\}\) are the sets of VMs of different types.

  • \(e= \left\{{e}_{1}, {e}_{2},...,{e}_{n}\right\}\) is the set of software containers in the cloud data center.

  • \({w}_{\text{cpu}}\), \({w}_{\text{mem}}\), and \({w}_{\text{bnw}}\) are defined as the scale factors of the CPU, memory, and network bandwidth in objective function (2), respectively.

  • \({cpu}_{\text{average}}\) is the average CPU usage (in percentage) between selected servers (n physical servers) for resource balancing.

  • \({w}_{e}=\left\{{w}_{\text{cont}}^{1},{w}_{\text{cont}}^{2},...,{w}_{\text{cont}}^{l}\right\}\) is the migration weight of software container e (e ∈ E).

  • \({\text{imagesize}}_{e}=\left\{{\text{imagesize}}_{\text{cont}}^{1},{\text{imagesize}}_{\text{cont}}^{2},...,{\text{imagesize}}_{\text{cont}}^{l}\right\}\) denotes the image size of software container e (e ∈ E).

  • \({TCP}_{\text{cpu}}=\left\{{TCP}_{\text{cpu}}^{1},{TCP}_{\text{cpu}}^{2},...,{TCP}_{\text{cpu}}^{n}\right\}\) is the total capacity of the CPU in host p (p ∈ P).

  • \({TCP}_{\text{mem}}=\left\{{TCP}_{\text{mem}}^{1},{TCP}_{\text{mem}}^{2},...,{TCP}_{\text{mem}}^{n}\right\}\) is the total memory capacity of host p (p ∈ P).

  • \({TCP}_{\text{bnw}}=\left\{{TCP}_{\text{bnw}}^{1},{TCP}_{\text{bnw}}^{2},...,{TCP}_{\text{bnw}}^{n}\right\}\) is the total bandwidth capacity of host p (p ∈ P).

  • \({TCV}_{\text{cpu}}=\left\{{TCV}_{\text{cpu}}^{1},{TCV}_{\text{cpu}}^{2},...,{TCV}_{\text{cpu}}^{n}\right\}\) is the total capacity of the CPU in VM v (v ∈ V).

  • \({TCV}_{\text{mem}}=\left\{{TCV}_{\text{mem}}^{1},{TCV}_{\text{mem}}^{2},...,{TCV}_{\text{mem}}^{n}\right\}\) is the total memory capacity of VM v (v ∈ V).

  • \({TCV}_{bnw}=\left\{{TCV}_{\text{bnw}}^{1},{TCV}_{\text{bnw}}^{2},...,{TCV}_{\text{bnw}}^{n}\right\}\) is the total bandwidth capacity of VM v (v ∈ V).

  • \({c}_{e}\) is the CPU consumption of software container e (e ∈ E).

  • \({mem}_{e}\) is the memory consumption of software container e (e ∈ E).

  • \({bnw}_{e}\) is the network bandwidth consumption of software container e (e ∈ E).

  • \({\text{cpu}-\text{correlation}}_{(\text{cont},\text{host})}\) is the correlation between the CPU unit allocated to software container e (e ∈ E) and the total CPU in the physical server (p ∈ P).

  • \({\text{mem}-\text{correlation}}_{(\text{cont},\text{host})}\) is the correlation between the memory unit allocated to software container e (e ∈ E) and the total memory in the physical server (p ∈ P).

  • \({\text{bnw}-\text{correlation}}_{(\text{cont},\text{host})}\) is the correlation between the network bandwidth in software container e (e ∈ E) and the total network bandwidth in the physical server (p ∈ P).

  • \({d}_{(u,v)}\) is the network delay between software containers u (u ∈ E) and v (v ∈ E). In the same data center, it is assumed to be 50 ms, and between the data centers, it is assumed to be 150 ms (Rossi et al., 2019) [19].

  • \({x}_{\overline{upvq} }\) equals 1 if there is a connection between software container u (u ∈ E) in the physical server p (p ∈ P) and software container v (v ∈ E) in the physical server q (q ∈ P); otherwise, it is 0.

  • \({D}_{\text{max}}\) denotes the maximum deployment time between physical servers.

  • \({x{\prime}}_{\text{ev}}\) is the placement of the software container e (e ∈ E) on VM v (v ∈ V) as determined at time step i-1 (last rebalancing), and the model is solved at time step i.

  • \({z}_{\text{vp}}\) is the VM placement v (v ∈ V) on physical server p (p ∈ P), as determined at time step i. In this model, there is no VM replacement, and because of that, the values are always the same.

The independent variables of the proposed replacement model, denoted as \({x}_{ev}\), are defined as 1 when the software container e (e ∈ E) is allocated in VM v (v ∈ V) and 0 in all other cases. This model is formulated using ILP for optimization purposes.

If a software container utilizes more than 90% of its allocated resources, its resource usage (\({c}_{e}, {mem}_{e},{bnw}_{e}\)) is predicted using a neural network algorithm (Sect. 5.1); otherwise, the average resource usage over the last 120 s is used.
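For illustration, a minimal Python sketch of this selection rule is given below; the container attributes, the predictor interface, and the 10-s sampling assumption are hypothetical placeholders for the monitoring and prediction components described in Sect. 5.1.

```python
def estimate_container_demand(container, predictor, threshold=0.9, window_s=120):
    """Return (cpu, mem, bnw) estimates for one software container.

    If the container uses more than `threshold` of any allocated resource,
    the trained neural-network `predictor` is queried (Sect. 5.1);
    otherwise the average usage over the last `window_s` seconds is used.
    """
    # Utilization ratios relative to the container's allocation (hypothetical fields).
    ratios = (
        container.cpu_usage / container.cpu_alloc,
        container.mem_usage / container.mem_alloc,
        container.bnw_usage / container.bnw_alloc,
    )
    if max(ratios) > threshold:
        # Forecast the next-interval demand with the neural network.
        return predictor.predict(container.history)
    # Fall back to the mean of the recent monitoring samples (10-s interval assumed).
    recent = container.history[-(window_s // 10):]
    n = len(recent)
    return (
        sum(s.cpu for s in recent) / n,
        sum(s.mem for s in recent) / n,
        sum(s.bnw for s in recent) / n,
    )
```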

The proposed replacement model is formulated as follows:

Minimizing

$$ \frac{1}{2}\sum_{e \in E} \sum_{v \in V} w_{e} \,{\text{imagesize}}_{e} \left| {x_{{{\text{ev}}}}^{\prime } - x_{{{\text{ev}}}} } \right| $$
(1)

Minimizing

$$ \sum_{p \in P} \left( \begin{gathered} w_{{{\text{cpu}}}} \left( {cpu_{{{\text{average}}}} TCP_{{{\text{cpu}}}}^{p} - \sum_{v \in V} \sum_{e \in E} {\text{cpu-correlation}}_{(e,p)} c_{e} x_{{{\text{ev}}}} z_{{{\text{vp}}}} } \right) \hfill \\ + w_{{{\text{mem}}}} \left( {TCP_{{{\text{mem}}}}^{p} - \sum_{v \in V} \sum_{e \in E} {\text{mem-correlation}}_{(e,p)} mem_{e} x_{{{\text{ev}}}} z_{{{\text{vp}}}} } \right) \hfill \\ + w_{{{\text{bnw}}}} \left( {TCP_{{{\text{bnw}}}}^{p} - \sum_{v \in V} \sum_{e \in E} {\text{bnw-correlation}}_{(e,p)} bnw_{e} x_{{{\text{ev}}}} z_{{{\text{vp}}}} } \right) \hfill \\ \end{gathered} \right) $$
(2)
$${\sum }_{e\in E}{c}_{e}{x}_{\text{ev}}\le {TCV}_{cpu}^{v},\forall v\in V$$
(3)
$${\sum }_{e\in E}{mem}_{e}{x}_{\text{ev}}\le {TCV}_{\text{mem}}^{v},\forall v\in V$$
(4)
$$ \mathop \sum \limits_{{e \in { }E}} bnw_{e} x_{{{\text{ev}}}} \le TCV_{{{\text{bnw}}}}^{v} ,\forall v\, \in \,V $$
(5)
$${\sum }_{v\in V}{\sum }_{e\in E}{\text{cpu}-\text{correlation}}_{(e,p)}{c}_{e}{x}_{\text{ev}}{z}_{\text{vp}}\le {\text{UpperBound} cpu}_{\text{average}}{TCP}_{\text{cpu}}^{p},\forall p\in P$$
(6)
$${\sum }_{v\in V}{\sum }_{e\in E}{\text{cpu}-\text{correlation}}_{(e,p)}{c}_{e}{x}_{\text{ev}}{z}_{\text{vp}}\ge {\text{LowerBound} cpu}_{\text{average}}{TCP}_{\text{cpu}}^{p},\forall p\in P$$
(7)
$${\sum }_{v\in V}{\sum }_{e\in E}{mem-\text{correlation}}_{(e,p)}{mem}_{e}{x}_{\text{ev}}{z}_{\text{vp}}\le {TCP}_{mem}^{p},\forall p\in P$$
(8)
$${\sum }_{v\in V}{\sum }_{e\in E}{bnw-\text{correlation}}_{(e,p)}{bnw}_{e}{x}_{\text{ev}}{z}_{\text{vp}}\le {TCP}_{\text{bnw}}^{p},\forall p\in P$$
(9)
$${\sum }_{v\in V}{x}_{\text{ev}}=1, \forall e\in E$$
(10)
$${z}_{\text{up}}+{z}_{\text{vp}}-1\le {z}_{\overline{\text{uv}}}\le \frac{{z }_{\text{up}}+{z}_{\text{vp}}}{2}, \forall p\in P, \forall u,v\in V$$
(11)
$${d}_{(u,v)}{z}_{\overline{\text{uv}}}\le {ND}_{\text{max}}$$
(12)
$${x}_{\text{ev}}\in \left\{0, 1\right\}, \forall e\in E, \forall v\in V$$
(13)
$${x{\prime}}_{\text{ev}}\in \left\{0, 1\right\}, \forall e\in E, \forall v\in V$$
(14)
$${z}_{v}\in \left\{0, 1\right\}, \forall v\in V$$
(15)

4.1.1 Objective functions

The primary goal of objective function (1) is to avoid excessive migration of software containers between the VMs. When a software container migrates, its associated weight (\({w}_{e}\)) increases, which lowers the probability of its selection in future migrations. The binary term \(\left|{x{\prime}}_{\text{ev}}-{x}_{\text{ev}}\right|\) equals one when a software container e (e ∈ E) resides in VM v (v ∈ V) at time step i or i-1 and moves between these two time steps; it is zero otherwise. The objective function therefore favors software containers with lower migration weights and smaller disk image sizes, which reduces the number of migrations and enhances overall efficiency. Objective function (2) aims to reduce the disparity between each host's CPU usage and the average CPU usage of the chosen hosts, as well as the disparity between the memory and network bandwidth usage and the allocated resources of each selected host. In other words, the objective function minimizes the difference between the total CPU usage of the software containers (\({\sum }_{v\in V}{\sum }_{e\in E}{\text{cpu}-\text{correlation}}_{(e,p)}{c}_{e}{x}_{\text{ev}}{z}_{\text{vp}}\)) and the average CPU usage of the physical server (\({cpu}_{\text{average}}{TCP}_{\text{cpu}}^{p}\)), as well as between the total memory consumption (\({\sum }_{v\in V}{\sum }_{e\in E}{\text{mem}-\text{correlation}}_{(e,p)}{mem}_{e}{x}_{\text{ev}}{z}_{\text{vp}}\)) and network bandwidth consumption (\({\sum }_{v\in V}{\sum }_{e\in E}{\text{bnw}-\text{correlation}}_{(e,p)}{bnw}_{e}{x}_{\text{ev}}{z}_{\text{vp}}\)) of the software containers and the corresponding allocated resources of the physical server.
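To make objective functions (1) and (2) concrete, the sketch below evaluates them for a given placement; the dictionary-based data structures are hypothetical, and the scale factors are assumed to be supplied by the caller.

```python
def objective_1(x_prev, x_new, weight, imagesize):
    """Eq. (1): penalize migrations, weighted by migration weight and image size."""
    total = 0.0
    for (e, v), placed in x_new.items():
        moved = abs(x_prev.get((e, v), 0) - placed)  # 1 iff e enters or leaves v
        total += weight[e] * imagesize[e] * moved
    return 0.5 * total  # each migration toggles two (e, v) pairs


def objective_2(x_new, z_vp, usage, corr, cap, cpu_avg, w):
    """Eq. (2): distance of each host's CPU/memory/bandwidth load from its target."""
    total = 0.0
    for p in cap:                                    # iterate over physical servers
        load = {"cpu": 0.0, "mem": 0.0, "bnw": 0.0}
        for (e, v), placed in x_new.items():
            if placed and z_vp.get((v, p), 0):       # container e sits on host p
                for r in load:
                    load[r] += corr[r][(e, p)] * usage[r][e]
        total += w["cpu"] * (cpu_avg * cap[p]["cpu"] - load["cpu"])
        total += w["mem"] * (cap[p]["mem"] - load["mem"])
        total += w["bnw"] * (cap[p]["bnw"] - load["bnw"])
    return total
```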

4.1.2 Constraints

Constraints (3), (4), and (5) are formulated to mitigate potential shortages in CPU, memory, and network bandwidth resources for software containers residing within VMs, where \({\sum }_{e\in E}{c}_{e}{x}_{\text{ev}}\), \({\sum }_{e\in E}{mem}_{e}{x}_{\text{ev}}\), and \({\sum }_{e\in E}{bnw}_{e}{x}_{\text{ev}}\) are the total CPU, memory, and network bandwidth consumption of the software containers located on VM v. Considering the strong relationship between CPU utilization and energy consumption, as highlighted in constraints (6) and (7), it is crucial to keep the CPU resource usage close to the average CPU usage of the selected server group. By adjusting the upper_bound and lower_bound parameters, the scope of the restriction can be narrowed, resulting in host CPU usage that closely follows the average CPU usage of the chosen servers. The intended behavior is that the CPU resource consumption on a physical server (∀ p ∈ P, \({\sum }_{v\in V}{\sum }_{e\in E}{\text{cpu}-\text{correlation}}_{\left(e,p\right)}{c}_{e}{x}_{\text{ev}}{z}_{\text{vp}}\)) should not exceed the upper bound of the average CPU utilization of the chosen host systems (\({\text{UpperBound} cpu}_{\text{average}}{TCP}_{\text{cpu}}^{p}\)) but should surpass the lower bound of the average CPU utilization of the selected hosts (\({\text{LowerBound} cpu}_{\text{average}}{TCP}_{\text{cpu}}^{p}\)).

Constraints (8) and (9) are postulated to mitigate memory and network bandwidth shortages in physical servers. Constraint (10) ensures that each software container is placed in a single VM. Constraint (11) linearizes the nonlinear equation \({z}_{\overline{\text{uv}} }={z}_{\text{up}}.{z}_{\text{vp}}\) by addressing the interdependencies among the VMs [19], enabling their integration. Constraint (12) was introduced by Rossi et al. [19] to evaluate the network delay among hosting VMs; network latency within interconnected VMs must adhere to a stipulated critical threshold.
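As an illustration of how constraints such as (3), (4), and (10) could be expressed with an off-the-shelf ILP solver, the sketch below uses the PuLP library on a small hypothetical instance; the remaining ROCAMS constraints and both objectives are omitted for brevity.

```python
from pulp import LpBinary, LpMinimize, LpProblem, LpVariable, lpSum

# Hypothetical toy data: four containers, two VMs.
E = ["e1", "e2", "e3", "e4"]
V = ["v1", "v2"]
c = {"e1": 0.6, "e2": 0.8, "e3": 0.3, "e4": 0.5}      # CPU demand per container
mem = {"e1": 1.0, "e2": 2.0, "e3": 0.5, "e4": 1.5}    # memory demand (GiB)
TCV_cpu = {"v1": 2.0, "v2": 2.0}                      # per-VM CPU capacity
TCV_mem = {"v1": 4.0, "v2": 4.0}                      # per-VM memory capacity

prob = LpProblem("rocams_sketch", LpMinimize)
x = LpVariable.dicts("x", (E, V), cat=LpBinary)       # x[e][v] = 1 iff e is on v

# Placeholder objective; the real ROCAMS objectives (1)-(2) are omitted here.
prob += lpSum(c[e] * x[e][v] for e in E for v in V)

# Constraint (10): every container is placed on exactly one VM.
for e in E:
    prob += lpSum(x[e][v] for v in V) == 1
# Constraints (3) and (4): per-VM CPU and memory capacity.
for v in V:
    prob += lpSum(c[e] * x[e][v] for e in E) <= TCV_cpu[v]
    prob += lpSum(mem[e] * x[e][v] for e in E) <= TCV_mem[v]

prob.solve()
placement = {e: next(v for v in V if x[e][v].value() == 1) for e in E}
```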

In summary, our placement model's incorporation of linear constraints and dual linear objective functions significantly improves its mathematical stability and ensures convergence. This is because the optimization process explores the constrained parameter space to attain a solution that optimizes both objectives, resulting in a practical method for addressing the placement optimization problem.

4.2 Benchmark model for comparison

In 2019, Rossi et al. [19] proposed a model that considers multiple QoS attributes, namely minimizing the application performance penalty, adaptation cost, and resource cost. The original model is formulated as follows:

$$Z(x)={\sum }_{v\in V}{z}_{v}$$
(16)
$$D\left(x\right)={\sum }_{e\in E}{\sum }_{v\in V}{D}_{\text{ev}}\left(x\right)$$
(17)
$${D}_{\text{ev}}(x)=\frac{{\sum }_{i\in I,{I}_{v}}{L}_{i}}{{DR}_{v}}{x}_{\text{evp}}+{T}_{e}{\delta }_{evp}$$
(18)

Minimizing

$$ { }\left( {\frac{z\left( x \right)}{{z_{{{\text{max}}}} }} + \frac{D\left( x \right)}{{D_{{{\text{max}}}} }}} \right) $$
(19)
$${\sum }_{v\in V}{x}_{\text{ev}}=1, \forall e\in E, \forall v\in V$$
(20)
$${\sum }_{v\in V}{{c}_{e}x}_{\text{ev}}\le {c}_{v},\forall e\in E, \forall v\in V$$
(21)
$${x}_{\text{ev}}\le {z}_{v}\le {\sum }_{e\in E}{x}_{\text{ev}},\forall e\in E, \forall v\in V$$
(22)
$${d}_{(u,v)}{z}_{\overline{\text{uv}}}\le {ND}_{\text{max}}, \forall u,v \in V$$
(23)
$${z}_{\text{up}}+{z}_{\text{vp}}-1\le {z}_{\overline{\text{uv}}}\le \frac{{z }_{\text{up}}+{z}_{\text{vp}}}{2},\forall u,v\in V$$
(24)
$${\delta }_{\text{ev}}\ge {x}_{\text{ev}}-{x{\prime}}_{\text{ev}},\forall e\in E, \forall v\in V$$
(25)
$$ \delta _{{{\text{ev}}}} \le \frac{{(1 - x^{\prime } _{{{\text{ev}}}} ) + x_{{{\text{ev}}}} }}{2},\forall e \in E,\forall v \in V $$
(26)
$$ \delta_{{{\text{ev}}}} \in \left\{ {0,1} \right\},{ }\forall { }e \in { }E,{ }\forall { }v \in { }V $$
(27)
$$ z_{{\overline{{{\text{uv}}}} }} \in \left\{ {0,1} \right\},{ }\forall { }v,u \in { }V $$
(28)
$$ x_{{{\text{ev}}}} \in \left\{ {0,1} \right\},{ }\forall { }e \in { }E,{ }\forall { }v \in { }V $$
(29)
$$ x_{{{\text{ev}}}}^{\prime } \in \left\{ {0,1} \right\},{ }\forall { }e \in { }E,{ }\forall { }v \in { }V $$
(30)
$$ z_{v} \in \left\{ {0,1} \right\},{ }\forall { }v \in { }V $$
(31)
$$ z_{v}^{\prime } \in \left\{ {0,1} \right\},\forall { }v \in { }V $$
(32)

Equation (16) quantifies the number of VMs allocated for the execution of the application. Equation (17) encompasses the temporal requirements associated with retrieving software container images and initiating software container instances. Equation (18) represents the temporal demand for deploying software container e onto VM v. Expression (19) serves as the objective function, minimizing both the adaptation time and the overall cost incurred by the VMs. A key feature is ensuring that every software container is placed within a single VM (20). Constraint (21) addresses the problem of inadequate CPU resources for software containers housed within VM v. Constraint (22) links the VM usage indicator \({z}_{v}\) to the software container placements, and constraint (23) evaluates the network delays among the hosting VMs; the network latency within the interconnected VMs must meet a specified critical threshold. Constraint (24) linearizes the nonlinear equation \({z}_{\overline{\text{uv}} }={z}_{\text{up}}.{z}_{\text{vp}}\), thereby facilitating the incorporation of interdependencies among the VMs. Constraints (25) and (26) relate the allocation of software container e at time step i to its placement at the previous time step (i-1); they determine whether the allocation has changed or remained the same [19].

The model incorporates two approaches to address various aspects of QoS. First, it aims to reduce the number of VMs (\({z}_{v}, v \in V\)) by using a binary variable that signifies whether at least one software container is hosted by a given VM (constraint 22). This approach has practical implications as it can significantly reduce the cost of VMs. Second, it seeks to minimize the time required to deploy and adapt the software containers, which can lead to faster and more efficient data center operations.

The primary concern is whether adding and removing VMs within the model can effectively reduce the overall resource costs, or whether other factors need to be considered. The second noteworthy concern pertains to the adequacy of considering only the CPU utilization of the VMs as a basis for QoS assessment; in our evaluation, we also consider other resources such as memory and network bandwidth. To address these concerns, a comparative analysis was conducted against the benchmark model. This analysis not only validates the effectiveness of the proposed model (ROCAMS) in mitigating resource costs but also underscores its potential for cloud computing operations.

5 Proposed approach

This section discusses the scheduling of hosts in our proposed model (ROCAMS), which is crucial for optimizing performance and resource utilization. We present a systematic approach, starting with the design framework and delving into critical algorithmic factors to ensure dependable solutions. The section explores adapted GA and PSO variants and introduces the novel Container–EcoBalancer.

5.1 Replacement strategy

The system model encompasses multiple entities that characterize typical cloud infrastructures. Central to this model are data centers that provide foundational resources for physical server provisioning within the infrastructure layer. These data centers offer different physical server types with varying characteristics. Servers host various VM types, each possessing distinctive attributes related to resource capacity, primarily computational and memory. VMs can host various types of software containers, each with its own specific CPU and memory allocations. Software container replacement or migration is performed across the VMs. Before the replacement model is solved, predictions of resource utilization are used to identify overutilized and underutilized servers. These servers can then be grouped into a single cluster or placed between two or more clusters.

Based on the model discussed in the previous section, the following nine essential steps comprise the framework.

Step 1. Physical servers that are either overutilized or underutilized are identified.

Step 2. Software containers with resource utilization levels exceeding 90% are identified.

Step 3. The resources of the identified software containers are predicted. Many studies have used ML to predict the resource utilization of software containers, covering a broad range of topics [20,21,22,23]. Our study utilized multiple deep-learning models, including bidirectional LSTM, CNN LSTM, convolutional LSTM, and stacked LSTM, all implemented using the Keras library (a minimal prediction sketch is given after these steps). A more comprehensive analysis remains a topic for future work.

Step 4. Vertical scaling becomes necessary when specific software containers exhibit sustained high resource utilization. In such cases, CPU and memory allocations are adjusted as needed.

Step 5. The resource consumption of the physical servers is estimated using the predicted data. The resource utilization of software containers affects the resource utilization of VMs, which in turn affects the resource consumption of the physical servers. Moghaddam et al. [24] suggested exploring the relationship between the resource consumption of VMs and that of the host system. A reliable method involves using ML techniques to analyze how the resource utilization of software containers affects VM resource usage and how VM resource usage affects the utilization of physical server resources. In this study, we posit that the sum of software container resource consumption at the VM level represents the VM resource consumption, and that consolidating VM resource consumption at the physical server level represents the resource consumption of the physical server. Further investigation of these relationships could yield a more comprehensive analysis and is left for future work.

Step 6. A set of underutilized or overutilized physical servers is identified. The selected hosts have the same configuration so that the servers can be balanced.

Step 7. The proposed replacement model is then solved.

Step 8. The selected software containers are migrated across the VMs.

Step 9. The migration weights (\({w}_{e}\)) of the migrated software containers are increased.
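As referenced in Step 3, the following minimal sketch shows how one of the mentioned forecasters, a stacked LSTM, could be built with Keras; the window length, layer sizes, synthetic training series, and training settings are illustrative assumptions rather than the configuration used in our experiments.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

def make_windows(series, window=12):
    """Turn a 1-D utilization series (10-s samples) into (X, y) training pairs."""
    X, y = [], []
    for i in range(len(series) - window):
        X.append(series[i:i + window])
        y.append(series[i + window])
    return np.array(X)[..., np.newaxis], np.array(y)

# Hypothetical CPU-usage history of one container (fractions of its allocation).
history = np.clip(np.sin(np.linspace(0, 20, 500)) * 0.3 + 0.6
                  + np.random.normal(0, 0.02, 500), 0, 1)
X, y = make_windows(history)

# Stacked LSTM: two recurrent layers followed by a single regression output.
model = keras.Sequential([
    keras.Input(shape=(X.shape[1], 1)),
    layers.LSTM(32, return_sequences=True),
    layers.LSTM(16),
    layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=5, batch_size=32, verbose=0)

next_usage = float(model.predict(X[-1:], verbose=0)[0, 0])  # forecast next sample
```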

5.2 Algorithm

The mathematical framework detailed in Sect. 4 utilizes methodologies from the domain of applied integer programming. Traditional techniques, such as MILP solvers and dynamic programming, can be employed to tackle the proposed model (ROCAMS). However, the problem is NP-hard, which poses a practical challenge even for problem instances of moderate scale.

A metaheuristic is an algorithm for solving complex optimization problems. Although metaheuristics may not always produce an optimal solution, they are highly effective at finding a feasible solution much more quickly, especially when the problem is large and complex. When proposing a solution for a data center with stringent SLA requirements, it is crucial to ensure consistency between adjacent time intervals and to minimize discrepancies with the previous placement. To enhance the dependability of the proposed model (ROCAMS) and plan the data center effectively, three critical factors must be considered:

  1. The objective function of the proposed model (ROCAMS) favors software containers with the lowest migration weights, reducing the likelihood of selecting those with larger disk images and higher migration weights.

  2. Instead of migrating many software containers, the focus should be on identifying and migrating only the necessary software containers.

  3. By accepting the first acceptable (feasible) solution, the data center can achieve balance within the shortest possible time without unnecessary delays in the decision-making process.

We propose two metaheuristic algorithms to address the proposed mathematical model. The selection of the appropriate algorithm is contingent upon the intrinsic attributes of the problem at hand, harmonizing with the expectations delineated in the preceding sections. An algorithm is conceptualized to meet the previously discussed expectations and consider the complexities inherent in the problem at hand.

5.2.1 Adapting the genetic algorithm

GA is a prominent metaheuristic approach widely employed in various scientific domains. The following paragraphs explain the critical stages involved in customizing the algorithm for the proposed model (ROCAMS), although delving into the algorithm's intricacies may not be necessary (Algorithm 1 in Appendix).

The fundamental building block in this context is a purpose-designed chromosome, which establishes the precise configuration of software containers across the VMs. The number of VMs is fixed at a maximum of four per host. The first four software containers assigned to each host are allocated to VM#1, the subsequent batch to VM#2, and so on, following this pattern. The chromosome length equals the model's total software container count. Chromosomes are represented as lists when producing the initial population at the start of the algorithm; each list corresponds to the total number of software containers in the model, and its elements are integer values ranging from zero to the total software container count.
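Under the layout just described (four containers per VM, four VMs per host), a chromosome can be read as an ordering of container indices whose positions determine the hosting VM. The following minimal sketch of this encoding and its decoding uses hypothetical helper names and cluster sizes.

```python
import random

CONTAINERS_PER_VM = 4
VMS_PER_HOST = 4

def random_chromosome(num_containers):
    """A chromosome is a permutation of container indices (0 .. num_containers-1)."""
    genes = list(range(num_containers))
    random.shuffle(genes)
    return genes

def decode(chromosome):
    """Map each container to (host, vm) based on its position in the chromosome."""
    placement = {}
    for position, container in enumerate(chromosome):
        vm = position // CONTAINERS_PER_VM            # VM index in the whole cluster
        host = vm // VMS_PER_HOST                     # host index
        placement[container] = (host, vm)
    return placement

# Example: 32 containers -> 8 VMs on 2 hosts.
chrom = random_chromosome(32)
print(decode(chrom)[chrom[0]])                        # first gene lands on host 0, VM 0
```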

The crossover operation presents a unique challenge for this chromosome encoding: a typical crossover cannot be applied, because crossing two chromosomes can leave some software containers out of the new chromosome while duplicating others.

The mutation stage of the GA entails selecting two random positions, both smaller than the chromosome length (the total software container count), and swapping the values of the selected elements, thereby introducing diversity into the chromosome composition. The mutation rate in this study is assumed to be constant (0.25).
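A minimal sketch of this swap mutation is shown below; it keeps the permutation valid (no container is dropped or duplicated) and uses the constant rate of 0.25 stated above.

```python
import random

MUTATION_RATE = 0.25  # constant mutation probability assumed in this study

def swap_mutation(chromosome, rate=MUTATION_RATE):
    """Swap two randomly chosen genes; never drops or duplicates a container."""
    mutant = list(chromosome)
    if random.random() < rate:
        i, j = random.sample(range(len(mutant)), 2)   # two distinct positions
        mutant[i], mutant[j] = mutant[j], mutant[i]
    return mutant
```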

5.2.2 Adapted particle swarm optimization

This section explains the customization of a PSO approach (Algorithm 2 in Appendix) to suit the proposed model (ROCAMS). We focus solely on explaining the basic steps required to achieve this objective.

A particle's position is encoded by grouping the names of the software containers (e) located in particular VMs (v). The internal arrangement of software containers within a VM is not significant; placements such as (\({pod}_{1},{pod}_{2},{pod}_{3},{pod}_{4}\)) and (\({pod}_{1},{pod}_{4},{pod}_{2},{pod}_{3}\)) are classified as equivalent entities.

Swarm velocities (which can be interpreted as the velocities of birds or bees) evolve according to Eq. (33).

$$\text{Velocity}=w*\text{velocity}+{c}_{1}*{r}_{1}*(\text{bestbirdvelocity}-\text{velocity})+{c}_{2}*{r}_{2}*(\text{bestglobalvelocity}-\text{velocity})$$
(33)

The constants w, \({c}_{1}\), and \({c}_{2}\), which assume values of 0.8, 0.1, and 0.1, respectively, are crucial components of this equation. Additionally, variables \({r}_{1} and {r}_{2}\), which assume random values within the interval (0, 1), play a fundamental role in the velocity updating process. The initial velocity of each bird is equivalent to the global bird velocity.
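A minimal sketch of the velocity update in Eq. (33) with the stated constants is given below; the array-based particle representation and the helper names are assumptions.

```python
import numpy as np

W, C1, C2 = 0.8, 0.1, 0.1   # inertia and acceleration constants from the text

def update_velocity(velocity, best_bird_velocity, best_global_velocity, rng):
    """Eq. (33): blend the current velocity with the personal and global bests."""
    r1, r2 = rng.random(), rng.random()               # random factors in (0, 1)
    return (W * velocity
            + C1 * r1 * (best_bird_velocity - velocity)
            + C2 * r2 * (best_global_velocity - velocity))

rng = np.random.default_rng(0)
v = np.zeros(8)                                       # hypothetical 8-dimensional particle
v_best = rng.random(8)                                # personal best velocity
v_global = rng.random(8)                              # global best velocity
v = update_velocity(v, v_best, v_global, rng)
```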

We propose Eq. (34) to quantify the improvement achieved by the algorithm: after the replacements are executed, it sums, over all physical servers, the distance between each server's CPU utilization and the average CPU consumption of the hosts.

$$ Total\;distance = \sum _{{i = 0}}^{{number\;of\;host}} \left| {cpu\;usage\;of\;physical\;server_{i} - average\;of\;cpu\;host\;consumption} \right| $$
(34)

The critical factor in determining the velocity update is the relative performance of each consecutive step. The velocity updating mechanism is triggered when the cumulative CPU utilization distance in step i is shorter than in step i-1. This demonstrates the flexibility and effectiveness of the modified PSO algorithm through its integration with the proposed model (ROCAMS).
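The improvement criterion of Eq. (34) reduces to summing, per host, the absolute deviation of CPU usage from the average; a minimal sketch, assuming a plain list of per-host CPU usages, follows.

```python
def total_distance(host_cpu_usages):
    """Eq. (34): sum of |host CPU usage - average CPU usage| over all hosts."""
    average = sum(host_cpu_usages) / len(host_cpu_usages)
    return sum(abs(u - average) for u in host_cpu_usages)

# Velocities are updated only when the placement of step i improves on step i-1.
def improved(usages_step_i, usages_step_i_minus_1):
    return total_distance(usages_step_i) < total_distance(usages_step_i_minus_1)
```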

5.2.3 Proposed container–ecobalancer algorithm

The proposed methodology, which we call the Container–EcoBalancer, revolves around an algorithmic approach inspired by the principle of gradual cooling found in simulated annealing, applied to the design of the proposed model (ROCAMS). The initial CPU usage difference between each host and the average of the selected hosts is considered significant, and the model replaces software containers to fulfill the constraints outlined in the mathematical model, particularly Eqs. (6) and (7), thereby reducing the assumed high temperature of the initial placement. Subsequently, the upper and lower bounds are deliberately reduced, which precipitates the iterative re-solving of the model.

This cyclic progression persists until the model converges, achieving an optimal state with the smallest feasible upper and lower bound values. The decrease in these limits, a direct result of the principle of progressive cooling, holds the promise of significantly reducing the disparity in CPU usage across hosts. This mirrors the process employed in simulated annealing.

An initial assumption is made to commence this process, with upper and lower limits established at 1.5 and 0.5, respectively.

In the experimental evaluation, the algorithm is subjected to testing under distinct conditions with fixed upper and lower bounds, specifically ([0.6, 1.4], [0.7, 1.3], and [0.8, 1.2]), in conjunction with a strategy for the gradual reduction in these bounds. This experimental facet underscores the adaptability and efficacy of the algorithm in various scenarios, thereby showing its robustness and potential utility in practical applications.

The algorithm encompasses a series of steps, described as follows (Algorithm 3 in Appendix); a minimal code sketch follows these steps:

  1. Initialization and baseline establishment: Container–EcoBalancer starts operating by defining the essential parameters, with the upper and lower limits set at 1.5 and 0.5, respectively. The process also involves calculating the average CPU utilization across the chosen hosts, creating a foundation for subsequent optimization initiatives.

  2. Identifying overutilized and underutilized hosts: The algorithm uses historical and current information to identify hosts with CPU usage above or below the calculated average (Sect. 5.1, Steps 5 and 6). Overutilized software containers on these hosts are carefully selected (Sect. 5.1, Step 3), and underutilized software containers on the underutilized hosts are also identified.

  3. Calculation of migration weights and sorting: The migration priority of each software container is determined by multiplying its allocated migration weight by its disk usage. These weights are sorted in ascending order to facilitate efficient selection for migration operations, meaning that software containers with lower migration weights are chosen before those with higher migration weights.

  4. Iterative migration and constraint adherence: The algorithm embarks on an iterative journey toward balancing the CPU utilization of hosts, systematically replacing software container pairs, and drawing from lists of overutilized and underutilized software containers. Careful evaluation ensures system stability and adherence to the operational thresholds. An optimal solution is discerned upon meeting these constraints, and the iterative refinement process is concluded.

  5. Dynamic bound adjustment for enhanced optimization: Container–EcoBalancer employs a dynamic adjustment mechanism for bounds, gradually reducing the acceptable range of CPU utilization. This refined approach aligns more closely with the calculated average, resulting in an energy-efficient resource allocation while maintaining system integrity.
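The sketch below outlines one possible shape of the Container–EcoBalancer loop corresponding to the steps above; the host and container attributes, the solve_model callback standing in for re-solving the replacement model, and the bound-shrinking step size are hypothetical placeholders for Algorithm 3 in the Appendix.

```python
def container_ecobalancer(hosts, solve_model, step=0.1, min_gap=0.2):
    """Iteratively tighten the CPU-usage bounds around the cluster average."""
    upper, lower = 1.5, 0.5                          # initial bounds (item 1)
    cpu_avg = sum(h.cpu_usage for h in hosts) / len(hosts)

    placement = None
    while upper - lower >= min_gap:
        over = [h for h in hosts if h.cpu_usage > upper * cpu_avg]    # item 2
        under = [h for h in hosts if h.cpu_usage < lower * cpu_avg]

        # Item 3: rank migration candidates by migration weight x disk usage.
        candidates = sorted(
            (c for h in over + under for c in h.containers),
            key=lambda c: c.migration_weight * c.disk_usage,
        )

        # Item 4: re-solve the replacement model with the current bounds,
        # accepting the first feasible solution.
        solution = solve_model(candidates, lower, upper)
        if solution is None:                          # infeasible: stop tightening
            break
        placement = solution

        # Item 5: shrink the admissible interval around the average.
        upper -= step
        lower += step
    return placement
```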

In brief, the Container–EcoBalancer algorithm exhibits several noteworthy characteristics and advantages that deserve recognition:

  1. Container–EcoBalancer updates the values of the upper and lower bound parameters while solving the replacement model. This update is instrumental in proposing the optimal solution, as it considers constraints (6) and (7), which are closely related to the average CPU utilization.

  2. Within each VM, slightly fewer than half of the software containers are selected for migration.

  3. The algorithm's runtime and complexity are contingent on the number of overloaded and underutilized hosts and software containers in the system.

  4. Because of the large number of software containers in a realistically sized problem, the number of software containers moved in each iteration (Steps 7–9) can be increased.

The key concept behind this algorithm is to replace specific overutilized and underutilized software containers that have lower migration weights than the others, so as to obtain an initial feasible solution and keep CPU utilization within a range around the mean CPU resource utilization. Selecting overutilized and underutilized software containers for migration reflects the optimization objective encapsulated in objective function (1).

Metaheuristic approaches usually integrate various strategic techniques to avoid becoming stuck in local optima while tackling complex optimization problems in the quest for global optima. Consequently, the solutions proposed by these algorithms can involve an excessive number of replacements, frequently resulting in significant changes between the current placement and the newly proposed arrangement.

However, expanding the dimensions of the problem, that is, increasing the number of software containers, VMs, and hosts, simultaneously makes it more challenging to solve the associated mathematical model efficiently in the data center. The main objective is to attain a practical solution efficiently and rapidly while minimizing unnecessary software container replacements.

Finally, a comparative analysis of the efficiency of the Container–EcoBalancer, as presented in Algorithm 3 (Appendix), PSO (Algorithm 2), and GA (Algorithm 1), is conducted.

6 Experimental evaluation

This section describes the simulation setup by employing benchmarks with commercial models to replicate a real-world infrastructure. It also presents the experimental results, demonstrating the efficiency of the proposed model (ROCAMS) and algorithm.

6.1 Simulation setup

A thorough evaluation within a large virtual environment is crucial to assess the proposed model (ROCAMS) and algorithm for the Infrastructure as a Service (IaaS) platform, which offers users scalable computational resources. The practical implementation and evaluation of the proposed model (ROCAMS) and an algorithm for tangible infrastructure pose significant challenges in cost and time. Consequently, a simulation is employed to assess the performance of the proposed model (ROCAMS) and algorithm.

As we embark on our path toward experimentation, our main objective is to create a simulation environment that mirrors the characteristics of genuine data centers.

6.1.1 Infrastructure benchmark

GWDG is a joint academic IT service provider and competence center established by the Max Planck Society and the University of Goettingen, and it serves as a national supercomputing and AI service center. It provides a diverse range of services for academic users, including platforms for cloud services and virtualization. GWDG operates a large infrastructure in which users can individually order and deploy various types and configurations of software containers. We use GWDG as our benchmark due to its resemblance to commercial providers. In our test setup, each cluster contained 25 identical physical servers, and various VM and application container configurations were used for experimentation.

6.1.2 Experimental setup

In this study, we utilized the CloudSimPlus framework, a simulation toolkit, to replicate the environment of a cluster of cloud data centers in an academic setting. We configured a single type of physical server, a single type of VM, and a single type of software container. The experimental setup simulated a cloud computing environment using a cluster of GWDG data centers with 25 physical servers (Fig. 1). We assumed 100 VMs and 400 software containers; each physical server hosts four VMs, and each VM contains four software containers.

Fig. 1 Simulation environment

6.1.3 Resource utilization metrics

Tables 1, 2, and 3 list the configurations of the hosts, VMs, and software containers. As highlighted in Sect. 5, resource balancing may occur within an individual cluster or across multiple clusters. In this experiment, we simulated a cluster of GWDG data centers. The simulation generates resource consumption data for hosts, VMs, and software containers at 10-s intervals, and the generated data were saved in a PostgreSQL database for further analysis and inspection.

Table 1 Configuration of the physical servers in the simulator
Table 2 Configuration of the VMs in the simulator
Table 3 Configuration of the software containers in the simulator
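As a minimal illustration of the logging step mentioned above, the snippet below writes one 10-s host sample into a PostgreSQL table using psycopg2; the table layout, column names, and connection string are hypothetical.

```python
import psycopg2

conn = psycopg2.connect("dbname=simulation user=sim")   # hypothetical DSN
cur = conn.cursor()
cur.execute("""
    CREATE TABLE IF NOT EXISTS host_metrics (
        ts TIMESTAMPTZ, host_id INT,
        cpu DOUBLE PRECISION, mem DOUBLE PRECISION,
        bnw DOUBLE PRECISION, power DOUBLE PRECISION)
""")

def record_sample(ts, host_id, cpu, mem, bnw, power):
    """Persist one 10-second resource sample for a physical server."""
    cur.execute(
        "INSERT INTO host_metrics VALUES (%s, %s, %s, %s, %s, %s)",
        (ts, host_id, cpu, mem, bnw, power),
    )
    conn.commit()
```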

6.1.4 Resource consumption

The sum of the resources used by software containers at the VM level provides a comprehensive picture of the overall resource utilization of the VMs and hosts. The resulting simulated resource consumption data, covering hosts, VMs, and software containers, include the following metrics:

  • Hosts: CPU utilization, memory utilization, bandwidth utilization, and power consumption.

  • VMs: CPU utilization, memory utilization, and network bandwidth utilization.

  • Software containers: CPU utilization, memory utilization, and bandwidth utilization.

This collection of metrics, captured during the simulation, aims to provide a comprehensive representation of the resource utilization patterns found within the cloud data center environment.

6.1.5 Resource consumption behavior

We designated specific resource consumption behaviors for individual VMs to test scenarios with overutilized servers. In this setup, some VMs were configured to experience high resource consumption during a specific time interval (e.g., 8:00 AM to 4:00 PM), whereas others were set to experience low resource consumption during the same timeframe. This allowed us to mimic overloaded and underloaded conditions for each VM within a designated time interval. Because the simulator produced fixed output values, we introduced variability into the generated data across iterations by multiplying the resource consumption of the VMs by random values between 0.95 and 0.99, yielding more realistic results.

Although these experiments primarily focused on CPU utilization as the central resource, the evaluation also includes other resources, such as memory and network bandwidth, which are equally relevant in the proposed model (ROCAMS).

6.2 Results of experiments

Twenty-five experiments were designed to simulate the proposed algorithms and address the replacement model. To demonstrate the model's efficiency, we compare the replacement model solved with ILP solvers, GA, PSO, and the Container–EcoBalancer algorithm under varying strategies. These strategies include the reduction of the distance between the lower and upper bounds explained in Sect. 5.2.3 (Step 11), and constant bounds ([0.6, 1.4], [0.7, 1.3], [0.8, 1.2]), i.e., variants of the Container–EcoBalancer that omit Step 11 in Sect. 5.2.3. Owing to the problem size, the ILP results were out of bounds, as expected. In addition, the benchmark model (Sect. 4.2) was employed for comparative analysis. The following paragraphs provide an in-depth description of the experimental results.

The framework and algorithm were developed using Python, leveraging its versatility and extensive libraries for an efficient implementation. The experiments were conducted on an HP Pavilion Power Desktop 580 equipped with an Intel® Core™ i7-7700 processor that provided ample computational power. With 16.0 GiB of memory available, this hardware setup enabled rigorous testing and evaluation of the algorithm's performance, mirroring real-world computational environments.

We utilize Formula (35) to determine the percentage difference between the developed algorithms, which incorporate the constant upper and lower bounds and the decreasing-interval strategy discussed in Sect. 5.2.3 (Step 11), when applied to both the proposed model (ROCAMS) and the benchmark model, and to compare the proposed algorithm with GA and PSO. Finally, the average percentage difference across all 25 experiments was used as a benchmark for comparing models and algorithms.

$$ {\text{Percentage difference}} = \frac{{{\text{value}}_{1} - {\text{value}}_{2} }}{{\frac{{{\text{value}}_{1} + {\text{value}}_{2} }}{2}}}*100 $$
(35)
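A one-line implementation of Formula (35), together with its averaging over the experiment results, is given below for completeness; the pairing of values is hypothetical.

```python
def percentage_difference(value_1, value_2):
    """Eq. (35): signed difference relative to the mean of the two values."""
    return (value_1 - value_2) / ((value_1 + value_2) / 2) * 100

# Average percentage difference over the 25 experiments (hypothetical value pairs).
def average_difference(pairs):
    return sum(percentage_difference(a, b) for a, b in pairs) / len(pairs)
```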

6.2.1 Results using different models

Figures 2 and 3 illustrate the power savings of the proposed model (ROCAMS) and the benchmark model after solving each model with GA and PSO. When solved with GA, the proposed model saves 67.56% more power than the benchmark model, demonstrating its superior energy efficiency; when solved with PSO, it saves 72.87% more power, indicating even greater efficiency. This also implies that PSO is less efficient than GA in terms of power savings. As previously stated, the strategy outlined in the benchmark model, which simply reduces the number of VMs and the deployment time for migrated software containers, is insufficient for efficient system orchestration. As resource usage in a system increases, it becomes crucial to employ more precise strategies to orchestrate the hosts. A direct comparison of the objective function values was not performed because the two models have different objective functions.

Fig. 2 Comparison of saved power values (kW) between the proposed model (ROCAMS) and the benchmark model using a GA. The blue line indicates the proposed model (ROCAMS) with GA, whereas the green line indicates the benchmark model with GA

Fig. 3 Comparison of saved power values (kW) between the proposed model (ROCAMS) and the benchmark model using PSO. The blue line represents the proposed model (ROCAMS) with PSO, while the green line represents the benchmark model with PSO

To ascertain the statistical significance of the differences in power savings percentages between the two models, a Wilcoxon significance test was conducted at α = 0.01, with the alternative hypothesis (H1: proposed model − benchmark model > \({\mu }_{0}\)). The results indicate statistically significant evidence for the superior performance of the proposed model (ROCAMS) compared with the benchmark model for both the GA (p value = 2.9e-8, normality p value = 0.007571) and PSO (p value = 2.9e-8, normality p value = 0.3436) results.

6.2.2 Results under different evaluation metrics

Table 4 presents the percentage differences in the objective function, number of replacements, power savings, and runtime when comparing the Container–EcoBalancer with the GA and PSO to solve the proposed model (ROCAMS) to assess its efficiency. This assessment considered different upper and lower bounds: (lower bound = 0.8, upper bound = 1.2), (lower bound = 0.7, upper bound = 1.3), (lower bound = 0.6, upper bound = 1.4), and the approach (Container–EcoBalancer) of decreasing the distance between the lower and upper bounds.

Table 4 Percentage differences between the proposed algorithm (Container–EcoBalancer), GA, and PSO in solving the proposed model (ROCAMS)

Figure 4 shows the variation in the objective function values across the 25 experiments. For both the proposed multi-objective model and the benchmark model, objective functions 1 and 2 were combined into a single objective function in all algorithms, as reflected in the results. On average, the percentage differences between the proposed algorithm with constant upper and lower bounds and the GA are −0.006974% for bounds [0.8, 1.2], −0.007626% for bounds [0.7, 1.3], and −0.008404% for bounds [0.6, 1.4]. The difference between the Container–EcoBalancer and the GA is −0.0061602% (Table 4).

Fig. 4 Comparison of objective function values across the applied algorithms. The blue line represents the proposed algorithm [0.8, 1.2], the green line represents the proposed algorithm [0.7, 1.3], the red line represents the proposed algorithm [0.6, 1.4], the purple line represents the Container–EcoBalancer, the orange line represents the GA, and the magenta line represents the PSO

Similarly, compared with PSO, the proposed algorithm achieved differences of −0.003473% for bounds [0.8, 1.2], −0.004125% for bounds [0.7, 1.3], and −0.0049027% for bounds [0.6, 1.4], while the difference between the Container–EcoBalancer and PSO is −0.002659% (see Table 4). These findings can be explained as follows: reducing the gap between the upper and lower bounds increases the need for replacements, which leads to a higher objective function value. Nevertheless, the proposed algorithm handles the model with fewer replacements than PSO and GA.

Figure 5 illustrates the differences in runtime between the algorithms. Notably, decreasing the gap between the upper and lower bounds increased the computational time, which suggests that the algorithm had to find solutions involving more replacements.

Fig. 5 Runtime of the applied algorithms. The blue line represents the proposed algorithm [0.8, 1.2], the green line represents the proposed algorithm [0.7, 1.3], the red line represents the proposed algorithm [0.6, 1.4], the purple line represents the Container–EcoBalancer, the orange line represents the GA, and the magenta line represents the PSO

Under the given conditions, the proposed algorithm reduced the solution time by 52.86% ([0.8, 1.2]), 59.8% ([0.7, 1.3]), and 68.71% ([0.6, 1.4]) relative to the GA, whereas the Container–EcoBalancer solved the proposed model (ROCAMS) 45.12% more slowly than the GA (Table 4). The proposed algorithm with constant upper and lower bounds is also notably faster than PSO, achieving processing-time improvements of 57.63%, 63.94%, and 72.01%, whereas the Container–EcoBalancer is 33.64% slower than PSO (Table 4).

This demonstrates that the strategy of reducing the interval distance in the Container–EcoBalancer increases the runtime of the solving algorithm compared with defining constant upper and lower bounds.

Figure 6 provides insights into the number of replacements made by the different algorithms. Compared with the two other algorithms, the proposed algorithm achieved lower replacement rates. Relative to the GA, it shows percentage reductions of 98.34% ([0.8, 1.2]), 112.09% ([0.7, 1.3]), 130.44% ([0.6, 1.4]), and 87.73% (Container–EcoBalancer; see Table 4) in the number of replacements. Similarly, compared with PSO, the proposed algorithm with the specified bounds shows reductions of 63.01% ([0.8, 1.2]), 79.58% ([0.7, 1.3]), and 102.78% ([0.6, 1.4]), while the Container–EcoBalancer achieves a reduction of 50.68% (see Table 4). Reducing the interval between the lower and upper bounds results in more frequent replacements, because host CPU usage values must then fall within a narrower interval.

Fig. 6 Number of software container replacements in the applied algorithms. The blue line represents the proposed algorithm [0.8, 1.2], the green line represents the proposed algorithm [0.7, 1.3], the red line represents the proposed algorithm [0.6, 1.4], the purple line represents the Container–EcoBalancer, the orange line represents the GA, and the magenta line represents the PSO

Figure 7 shows the differences in power savings across the 25 experiments. On average, relative to the GA, the proposed algorithm with the specified bounds achieved 32.4% ([0.8, 1.2]), −6.87% ([0.7, 1.3]), and −41.63% ([0.6, 1.4]) more power savings, while the Container–EcoBalancer achieved 60.13% more (Table 4). Compared with PSO, the proposed algorithm with constant upper and lower bounds saved 33.16% ([0.8, 1.2]), −5.48% ([0.7, 1.3]), and −38.14% ([0.6, 1.4]) more power, and the Container–EcoBalancer 61.78% more (Table 4). These findings highlight the importance of accurately defining the interval between the upper and lower bounds when implementing energy-saving techniques.

Fig. 7 The value of saved power (kW) in the applied algorithms. The blue line represents the proposed algorithm [0.8, 1.2], the green line represents the proposed algorithm [0.7, 1.3], the red line represents the proposed algorithm [0.6, 1.4], the purple line represents the Container–EcoBalancer, the orange line represents the GA, and the magenta line represents the PSO

In addition, the Container–EcoBalancer algorithm, which narrows the interval between the lower and upper bounds, is more efficient in reducing total power consumption than the other methods.

To assess the significance of the differences in power savings between the proposed algorithm, GA, and PSO, Wilcoxon signed-rank tests, t-tests, and Mann–Whitney U-tests were conducted at significance levels of α = 0.01 and α = 0.05. The results indicate statistically significant differences in performance between the Container–EcoBalancer and the compared algorithms for both the GA and PSO results (Tables 5, 6). The differences in objective function values and numbers of replacements were also significant at both the 0.01 and 0.05 levels.
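A compact sketch of the three tests using SciPy is given below; the samples are hypothetical placeholders and do not correspond to the experimental data reported in Tables 5 and 6.

```python
# Hypothetical power-savings samples for two algorithms; not the paper's data.
import numpy as np
from scipy import stats

ecobalancer = np.array([6.3, 5.9, 6.8, 6.1, 6.5])
ga = np.array([3.9, 4.2, 3.7, 4.0, 4.1])

print("Wilcoxon signed-rank p-value:", stats.wilcoxon(ecobalancer - ga).pvalue)
print("paired t-test p-value:", stats.ttest_rel(ecobalancer, ga).pvalue)
print("Mann-Whitney U p-value:", stats.mannwhitneyu(ecobalancer, ga).pvalue)
```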

Table 5 Significance tests of power savings between Container–EcoBalancer and GA
Table 6 Significance tests of power savings between Container–EcoBalancer and PSO

Regarding the effectiveness of the proposed algorithm with the specified lower and upper bounds [0.8, 1.2] compared with GA and PSO, the proposed algorithm achieved higher power savings with fewer software container replacements. This makes the proposed algorithm more attractive than both PSO and GA.

Conversely, the designed PSO consistently provides better solutions than the adapted GA, meaning that it can solve the model with fewer replacements. This highlights the significant roles played by the defined parameters "velocity" and "total distance" (Eqs. 33, 34) in establishing the PSO's advantage over the GA.
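Equations 33 and 34 are not reproduced here; the sketch below is only a hypothetical illustration of how a discrete "total distance" and "velocity" step could be realized for container-to-host assignments, and it is not the exact formulation used in the designed PSO.

```python
# Hypothetical discrete PSO helpers for container-to-host assignments; illustrative only.
import random

def total_distance(position, best):
    """Hypothetical 'total distance': number of container-to-host assignments that differ."""
    return sum(1 for c in position if position[c] != best[c])

def apply_velocity(position, best, rate=0.5):
    """Hypothetical 'velocity' step: move a fraction of differing assignments toward the best."""
    new_position = dict(position)
    differing = [c for c in position if position[c] != best[c]]
    for c in random.sample(differing, k=int(rate * len(differing))):
        new_position[c] = best[c]
    return new_position

# Example: containers 0-3 assigned to hosts h1-h3 (placeholder data).
particle = {0: "h1", 1: "h2", 2: "h1", 3: "h3"}
global_best = {0: "h2", 1: "h2", 2: "h3", 3: "h3"}
print(total_distance(particle, global_best))            # 2 differing assignments
print(apply_velocity(particle, global_best, rate=0.5))  # moves one differing container toward the best
```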

6.2.3 Comparative analysis of algorithm convergence and diversity

To assess the performance of the three algorithms (Container–EcoBalancer, GA, and PSO) in multi-objective optimization, we additionally evaluated the hypervolume and Inverted Generational Distance (IGD) metrics; for all three algorithms, both indicators yielded a value of 0.0, indicating that each algorithm effectively covers the objective space and accurately represents the trade-offs between the conflicting objectives. These findings suggest that all three algorithms are highly effective in generating diverse and optimal solutions for the given problem.
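For reference, both indicators can be computed, for example, with the pymoo library (assuming pymoo ≥ 0.6 is available); the objective vectors below are placeholders rather than the fronts obtained in our experiments.

```python
import numpy as np
from pymoo.indicators.hv import HV
from pymoo.indicators.igd import IGD

# Placeholder non-dominated objective vectors (minimization); not the paper's data.
F = np.array([[0.2, 0.8], [0.5, 0.5], [0.8, 0.2]])  # solutions obtained by an algorithm
pareto_front = F.copy()                              # reference front; here identical to the obtained set

hv = HV(ref_point=np.array([1.0, 1.0]))
igd = IGD(pareto_front)
print("hypervolume:", hv(F))
print("IGD:", igd(F))  # 0.0 when the obtained set coincides with the reference front
```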

7 Conclusion

To address the critical issue of improving power consumption in cloud data centers by utilizing virtualization and containerization technologies, we introduced a novel software container replacement model based on ILP. This model incorporates two distinct objective functions and ten constraints. The primary objective function prioritizes the selection of software containers with lower resource demands. In contrast, the secondary objective function focuses on minimizing the summation of resource utilization disparities (specifically, the average CPU, memory, and network bandwidth) between the VMs and their hosting physical servers. Our proposed model (ROCAMS) enforces constraints to maintain CPU utilization levels close to the average CPU usage while protecting against memory and network bandwidth shortages in both VMs and physical servers. User-defined lower and upper bounds control the degree of proximity to the mean CPU utilization.

Through experimentation, we compared the outcomes of our proposed model (ROCAMS) with a benchmark model. The results demonstrate that our model achieves power savings exceeding 61% compared with the benchmark model. We also presented an efficient algorithm that expedites the solution of the replacement model with minimal software container replacement. Compared with the benchmark model solved using GA and PSO, our proposed model (ROCAMS) shows remarkable reductions in power consumption, surpassing 67.56% for GA and 72.87% for PSO. Additionally, the Container–EcoBalancer algorithm achieves significant power-consumption reductions, surpassing 60.13% for GA and 61.78% for PSO. Of particular significance for the Container–EcoBalancer algorithm is the reduction in the number of replacements: in our experiments, it consistently exhibited reductions of 87.73% and 50.68% in replacements compared with the applied GA and the proposed PSO, respectively.

Furthermore, we conducted experiments with varying upper and lower bound ranges. These experiments revealed that reducing disparities in average CPU utilization led to increased replacements and longer algorithm runtimes.

Looking ahead to future research, we intend to explore the simultaneous replacement of software containers and VMs. This holistic approach will enable us to investigate the combined effects of replacement strategies on power savings. In addition, we plan to develop an advanced algorithm capable of predicting resource utilization trends over extended durations. Such a predictive algorithm will provide valuable insights for optimizing resource allocation and enhancing overall power efficiency.

The outcomes of the custom-designed PSO reveal that it is more efficient in solving the model than the adapted GA. Further investigation into adding steps to and refining the designed PSO for this model is a promising avenue for future research.

In summary, our study contributes to reducing power consumption in cloud data centers by introducing a robust software container replacement model and an efficient algorithm. Our results highlight the superior performance of the proposed model (ROCAMS) and the proposed algorithm in achieving significant power savings.