1 Introduction

The surge in IoT device usage has led to the emergence of cloud computing as a significant research focus. It offers a variety of services across many application areas with a high degree of flexibility and scalability. The rapid growth of information and communication technologies (ICT) has led to the integration of big data with the IoT, revolutionizing cloud services. Within this transformative framework, cloud computing is pivotal in enabling efficient and scalable solutions for managing big data. Numerous cloud service providers enable organizations to obtain the software, storage, and hardware facilities needed to accomplish their goals at a much lower cost. Customers subscribe to the services they require under the cloud computing paradigm and sign a service level agreement (SLA) with the cloud vendor, outlining the quality of service (QoS) and conditions of service provision. Table 1 presents the service control that the various cloud service models offer to end-users. Load balancing is a method that distributes tasks among virtual machines (VMs) using a Virtual Machine Manager (VMM). It assists in handling different types of workloads, such as CPU, network, and memory demands (Buyya 2018; Mishra and Majhi 2020). The cloud computing infrastructure faces three significant challenges: virtualization, distributed frameworks, and load balancing. The load-balancing problem is defined as the allocation of workloads among the processing modules. In a multi-node environment, it is quite probable that certain nodes will experience excessive workload while others remain idle. Load imbalance is a harmful event for cloud service providers (CSPs), as it diminishes the dependability and effectiveness of computing services while also putting at risk the quality of service (QoS) guaranteed in the service level agreement (SLA) between the customer and the cloud service provider (Oduwole et al. 2022). Verma et al. (2024) introduced a load-balancing methodology, utilizing genetic algorithms (GA), to improve the quality of the telemedicine industry by efficiently adapting to changing workloads and network conditions at the fog level. This adaptability can enhance patient care and provide scalability for future healthcare systems. Walia et al. (2023) cover several emerging technologies in their survey, including Software-Defined Networking (SDN), Blockchain, Digital Twins, Industrial IoT (IIoT), 5G, Serverless computing, and quantum computing. These technologies can be incorporated with the current fog/edge-of-things models for improved analysis and can provide business intelligence for IoT platforms. Adaptive resource management strategies are necessary for efficient scheduling and decision-offloading due to the infrastructural efficiency of these computing paradigms.

Table 1 Service control offered to end-users by the various cloud service models

1.1 Need for load balancing, factors affecting and associated challenges

Intelligent Computing Resource Management (ICRM) is rapidly evolving to meet the increasing needs of businesses and sectors, driven by the proliferation of Internet-based technologies, cloud computing, and cyber-physical systems. With the rise of information-intensive applications, artificial intelligence, cloud computing, and IoT, intelligent computing monitoring and resource allocation have become crucial (Biswas et al. 2024). Cloud data centers are built to handle hundreds of workloads and therefore typically need to be optimized; otherwise, they can suffer from low resource utilization and energy waste. The goals of load balancing include reduced job execution times, optimal resource utilization, and high system throughput. Load balancing reduces the overall resource waiting time and avoids resource overload (Apat et al. 2023). In terms of equilibrium load distribution, load balancing between virtual machines (VMs) is an NP-hard problem. Its difficulty stems from two elements: the huge solution space and the requirement for polynomial-bounded computation. In a cloud computing environment, the load can be characterized as under-loaded, overloaded, or balanced. Identifying overloaded and under-loaded nodes and then redistributing the load across them is critical to load balancing (Santhanakrishnan and Valarmathi 2022). Alongside these technological advances, a sequence of challenges has emerged, including storage capacity, high processing speed, low latency, fast transmission, load balancing, efficient routing, and cost efficiency. Load balancing is a crucial optimisation procedure in cloud computing, and achieving this objective depends on dynamic resource allocation. Some factors that affect load balancing in cloud computing are as follows:

  • Workload patterns: Varying workloads, unpredictable traffic patterns, and heterogeneous applications may affect the efficiency of the cloud system.

  • Geographical distribution: Cloud data centres are generally located in remote areas, which contributes to transmission delays. Fog computing and edge computing are therefore required to reduce these delays, but the limited resources of fog and edge devices must be managed efficiently.

  • Cost and budget constraints: Cost considerations have a significant impact on load-balancing strategies, which frequently aim to use less expensive resources or minimize idle assets.

  • Application dynamics and monitoring: The dynamic nature of applications necessitates the elasticity and scalability of cloud services. In addition, inadequate monitoring makes it challenging to balance the load.

  • SLAs and breaches: The services offered by cloud service providers determine the risk of SLA violations. It is necessary to maintain quality without compromising other factors such as throughput, makespan, energy consumption, and cost.

  • Virtual Machine (VM) migrations: An increase in the number of VM migrations leads to a decrease in service quality. While VM migration can be beneficial to some extent, frequent migrations increase time complexity: transferring a VM's state, including copying its memory pages to the destination host, is time-consuming.

  • Resource availability: Insufficient resources, such as CPU, memory, or bandwidth, limit the load balancing efficiency.

  • Energy consumption: Energy consumption is a critical factor in data centers. Load balancing is necessary to reduce it by migrating VMs from overloaded hosts to underloaded ones.

Other factors like fault tolerance, predictive analytics, network latency, and data security also affect load balancing in a cloud system. We have divided the technologies reviewed through this SLR into five categories: conventional/traditional, heuristic, meta-heuristic, ML-centric, and hybrid. Traditional approaches to cloud computing resource allocation and load balancing are time-consuming, unable to yield fast results, and frequently trapped in local optima (Mousavi et al. 2018). In dynamic cloud systems, where resource requirements are estimated at runtime, static load balancing algorithms might not be successful. Dynamic load balancing algorithms, like Equally Spread Current Execution (ESCE) and the Throttled mechanism, analyse resource requirements and usage during runtime, yet they may incur extra cost and overhead. Traditional algorithms often struggle to scale with the size and complexity of problems. Several articles explore traditional task scheduling algorithms, including Min-min, First Come First Serve (FCFS), and Shortest Job First (SJF). These algorithms are not used often due to their slow processing and time-consuming behaviour. To overcome the issues of conventional methods, heuristic approaches entered the research area. Kumar and Sharma (2018) propose a resource provisioning and de-provisioning algorithm that outperforms FCFS, SJF, and Min-min in terms of makespan time and task acceptance ratio. However, task priority is poorly considered, highlighting a limitation in task allocation strategies. Heuristic algorithms demonstrate remarkable scalability. They are highly suitable for handling large-scale optimisation challenges in various industries, including manufacturing, banking, and logistics, due to their efficiency in locating approximate solutions, even in enormous search spaces (Mishra and Majhi 2020). Kumar et al. (2018) presented another heuristic method named ‘Dynamic Load Balancing Algorithm with Elasticity’, showcasing reduced makespan time and an increased task completion ratio. Dubey et al. (2018) introduced a Modified Heterogeneous Earliest Finish Time (HEFT) algorithm, demonstrating improved server workload distribution to reduce makespan time. While promising, both studies lack comprehensive performance evaluations and address only a limited set of other Quality of Service (QoS) metrics, such as response time and cost efficiency. Hung et al. (2019) proposed an Improved Max–min algorithm, achieving the lowest completion times and optimal response times; it outperformed the conventional RR, Max–min, and Min-min algorithms.
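To make the contrast concrete, the sketch below shows a Throttled-style dynamic allocator in Python: the balancer keeps an availability index over VMs and hands each incoming request to the first free VM. The class name, VM identifiers, and queueing behaviour are illustrative assumptions made for this review, not the implementation of any cited study.

```python
# Minimal sketch of a Throttled-style dynamic load balancer.
# Assumption: each VM is either available or busy; the balancer keeps an
# index table and returns the first available VM for an incoming request.

class ThrottledBalancer:
    def __init__(self, vm_ids):
        # Availability index: True means the VM can accept a new task.
        self.available = {vm: True for vm in vm_ids}

    def allocate(self):
        """Return the first available VM, or None if all are busy (request is queued)."""
        for vm, free in self.available.items():
            if free:
                self.available[vm] = False
                return vm
        return None  # caller queues the request until a VM is released

    def release(self, vm):
        """Mark a VM as available again once its task completes."""
        self.available[vm] = True


balancer = ThrottledBalancer(["vm-0", "vm-1", "vm-2"])
first = balancer.allocate()   # -> "vm-0"
second = balancer.allocate()  # -> "vm-1"
balancer.release(first)       # "vm-0" can now serve the next request
```

A static scheme such as RR would instead assign requests in a fixed cyclic order regardless of the observed VM state, which is why it can underperform when resource demands only become known at runtime.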

The development of meta-heuristic algorithms aimed to address the shortcomings of heuristic algorithms, which typically produce approximate rather than ideal solutions. Hybrid techniques have gained traction in recent years, combining heuristic, traditional, and machine-learning approaches. Mousavi et al. (2018) propose a hybrid technique combining Teaching Learning-Based Optimization (TLBO) and Grey Wolf Optimization (GWO), achieving maximized throughput without falling into local optima. Similarly, Behera and Sobhanayak (2024) propose a hybrid GWO-GA algorithm, outperforming GWO, GA (Rekha and Dakshayini 2019), and PSO in terms of makespan, cost, and energy consumption. Further, we have also discussed the cloud and fog architecture and its working principles in the upcoming sections.

1.2 Motivation for the study

The Industrial Internet of Things (IIoT) has experienced significant advancement and implementation due to the rapid progress and use of artificial intelligence techniques. In Industry 5.0, the hyper-automation process involves the deployment of intelligent devices connected to the Industrial Internet of Things (IIoT), cloud computing, smart robots, agile software, and embedded components. These systems can leverage the Industry 5.0 concept, which generates massive amounts of data for hyper-automated communication across cloud computing, digital transformation, human sectors, intelligent robots, and industrial production. Big data management requires cloud and fog technology (Souri et al. 2024). Similarly, telemedicine, facilitated by fog computing, has revolutionized the healthcare industry by providing remote access to medical treatments. However, ensuring minimal latency and effective resource utilization is essential for providing high-quality healthcare (Verma et al. 2024). Big data in the industrial sector is crucial for predictive maintenance, enabling informed decisions and enhancing task allocation in Industry 4.0, thus necessitating a proficient resource management system (Teoh et al. 2023). The growing demand for load balancing across the many industries that use cloud/fog services motivated us to compose this evaluation of the escalating need for resource management technologies. This review’s core contribution is to provide insights into innovative algorithms, their weaknesses and strengths, details of the datasets used, simulation tools, research gaps, and future research directions.

1.3 Objectives of the SLR

Based on a detailed review of the selected studies, this SLR pursues the following objectives:

  • To systematically identify and categorise the different load balancing and task scheduling algorithms used in cloud computing.

  • To address fundamental research questions, such as the effectiveness of different algorithmic approaches, simulation tools, metrics evaluation, etc.

  • To analyse trends and patterns in the literature, such as the prevalence of Meta-heuristic, Hybrid, and ML-centric approaches, and identify any shifts or emerging paradigms in algorithm design.

  • To conduct a comparative analysis of the different algorithm categories, identifying strengths, weaknesses, research limitations and trade-offs between them.

  • To lay the groundwork for future technological advancements by identifying areas where further research and development are needed.

1.4 Research contributions of the SLR

Through this SLR, we have attempted to contribute the following insights, which are based on authentic, selected study material:

  • We have examined selected articles to identify the research patterns and technological advancements related to resource load balancing in cloud computing. We have devised research questions and attempted to ascertain their solutions.

  • Using this SLR, we presented a taxonomy of algorithms that provide solutions to the chosen problem.

  • We provided an in-depth examination of the limitations and advantages of different strategies, along with a thorough comparative study of the techniques discussed in Table 5, Table 7, and Table 8.

  • We have discussed the performance metrics related to load balancing and task scheduling in the cloud system. We have also explored the simulation tools that the authors in this field prefer.

  • We have tabulated some benchmarked datasets (Table 6) utilized by various authors to achieve several performance metrics.

  • Finally, we compiled the research gaps and potential areas for future research.

The paper is structured in nine sections, as shown in Fig. 1.

Fig. 1
figure 1

Various sections and subsections of the SLR

2 Methodology of the systematic literature review

This section lays out the components of a systematic literature review, including the search criteria, review methodology, and research questions. This process involves defining research questions or objectives, identifying relevant databases and sources, and systematically searching and screening for eligible studies. The search term constitutes a string encompassing all essential keywords in the research questions and their corresponding synonyms.

2.1 Search criteria and quality assessment

The keywords utilized to form the search strings are “load balancing”, “task scheduling”, “cloud computing”, and “machine learning”. To extract relevant papers, the following advanced search query was used in the Scopus database:

figure a

The various computer science publication libraries were manually searched. The SLR search was conducted using the Scopus database, IEEE Computer Society, ResearchGate, Science Direct, Springer, and ACM Digital Archive.

A total of 550 papers were found initially using the above-mentioned advanced query. Then we applied the Inclusion–exclusion criteria provided in Table 2. Approximately 122 papers were excluded based on having zero citations or requiring purchase to access. We have incorporated cross-referenced studies to obtain a more comprehensive and quality analysis. We manually chose 35 cross-references from the extracted set that strictly adhered to the search criteria to encompass a broader range of reliable studies. A comprehensive selection of 96 papers was finalised, comprising 63 research articles exclusively considered for the technological survey.

Table 2 Inclusion–exclusion criteria for filtering the relevant articles for the SLR

2.2 Inclusion–exclusion criteria

The criterion for accepting or rejecting a research paper for the study is explained in Table 2 below.

Data extraction has been performed to capture key information from each study, such as design, methods or techniques, research limitations, future scope, tools, evaluation metrics, and other significant findings. This captured information was then synthesized and analyzed through a systematic and structured approach and placed in a tabular format to provide insights and draw conclusions about the research questions.

2.3 Research questions

This study aims to search for answers to the following research issues by investigating, comprehending, and evaluating the methods, models, and algorithms utilized to achieve task scheduling and load balancing.

  1. What are the current load balancing and task scheduling techniques commonly used in cloud computing environments?

  2. What are the key factors influencing the performance of load-balancing mechanisms in cloud computing?

  3. Which evaluation metrics are predominantly utilized for assessing the efficacy of load-balancing techniques in cloud computing environments?

  4. Which categories of algorithms are used more in the recent research trend in the cloud computing environment for solving load balancing issues?

  5. Which simulation software tools have garnered prominence in recent scholarly analyses within the domain of cloud computing research?

  6. What insights do the future perspectives within the reviewed literature offer in terms of potential avenues for exploration and advancement within the field?

The next section explores the working principle and architecture of cloud computing, which incorporates the fog and IoT application layers.

3 Cloud-fog architecture and relevant frameworks

Cloud-fog architecture extends a centralized cloud infrastructure by broadening the scope of cloud computing functionalities towards the network’s edge. It leverages fog computing, an intermediate layer between cloud servers and end devices, to enable real-time processing, data storage, and analytics closer to the data source. Fog nodes, deployed at the network edge, act as mediators linking end devices and the cloud, thus reducing latency and bandwidth consumption. These nodes can be physical or virtual entities, such as routers, switches, gateways, or even edge servers.

3.1 Working principles

The working principles of cloud-fog architecture involve collaboration between cloud servers, fog nodes, and end devices, creating a distributed computing environment. An end device initiates a request, which first passes through the nearest fog node. The fog node performs initial processing, filtering, and aggregation of the data before sending a subset of it to the cloud for further analysis or storage. By offloading some processing tasks to the fog nodes, cloud-fog architecture reduces the burden on the cloud, improves response times, and enhances the overall system performance. During task execution, dynamic cloud load balancing techniques assign tasks to virtual machines and adjust the load on these machines based on the system’s conditions (Tawfeeg et al. 2022). Alatoun et al. (2022) presented an EEIoMT framework for executing critical tasks in the shortest time in smart medical services while balancing energy consumption with other tasks. The authors have utilized ECG sensors for health monitoring at home. Similarly, Swarna Priya et al. (2020) have proposed an energy-efficient framework known as the ‘EECloudIoE framework’ for retrieving information from the IoE cloud network. The authors have adopted the ‘Wind Driven Optimization algorithm’ to form clusters of sensor nodes in the IoE network. Then, the Firefly algorithm is utilized to select the ‘cluster head’ (CH) for each cluster. Sensor nodes in sensor networks are also used to track physical events across widely dispersed geographic locations. These nodes assist in gathering crucial data from these sites over extended periods; however, they suffer from low battery power. Therefore, it is essential to implement energy-efficient systems using wireless sensor networks to collect this data. Still, cloud computing has some limitations, such as the geographical location of cloud data centers, network connectivity with end nodes, weather conditions, etc. To overcome these issues, fog computing emerged as a solution. Fog computing acts as an arbitrator between end devices and cloud computing, providing storage, networking, and computation services closer to edge devices. The introduction of edge computing has brought about the emergence of various computing paradigms, such as Mobile Edge Computing (MEC) and Mobile Cloud Computing (MCC). MEC primarily emphasizes a 2- or 3-tier application in the network and mobile devices equipped with contemporary cellular base stations. It improves the efficiency of networks by optimizing content distribution and facilitating the creation of applications (Sabireen and Neelanarayanan 2021). Figure 2 shows how the cloud, fog, and IoT layers work in collaboration.

Fig. 2
figure 2

The fog extends the cloud closer to the devices producing data (Swarna Priya et al. 2020; Vergara et al. 2023)
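The request flow described above (end device → fog node → cloud) can be sketched as a simple filter-and-aggregate rule at the fog layer. The reading format, the alert threshold, the aggregation window, and the forward_to_cloud stub below are illustrative assumptions for this review rather than a prescribed protocol.

```python
# Sketch of fog-node preprocessing: filter and aggregate sensor readings locally,
# forwarding only a reduced summary to the cloud (assumed message format).

def forward_to_cloud(summary):
    # Stand-in for a real cloud API call.
    print("to cloud:", summary)

def fog_node_process(readings, alert_threshold=100.0, window=10):
    """Aggregate raw readings at the fog layer; escalate anomalies immediately."""
    buffer = []
    for value in readings:
        if value >= alert_threshold:
            # Latency-critical event: bypass aggregation and notify the cloud now.
            forward_to_cloud({"type": "alert", "value": value})
            continue
        buffer.append(value)
        if len(buffer) == window:
            # Send only an aggregate, reducing bandwidth toward the cloud.
            forward_to_cloud({"type": "summary",
                              "mean": sum(buffer) / len(buffer),
                              "count": len(buffer)})
            buffer.clear()
    return buffer  # readings still waiting for the next summary

fog_node_process([42.0, 55.5, 120.3, 61.2], alert_threshold=100.0, window=2)
```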

3.2 Cloud computing layer

Cloud computing facilitates virtualization technology, which combines distributed and parallel processing. Using centralized data centers, it transfers computations from on-premises infrastructure to off-premises facilities. It has become an advanced technology within the swiftly expanding realm of computing paradigms owing to two principles: (1) ‘Dynamic Provisioning’ and (2) ‘Virtualization Technology’ (Tripathy et al. 2023). Dynamic provisioning is a fundamental concept in the realm of cloud computing. It refers to the automated process of allocating and adjusting computing resources to meet the changing needs of cloud-based applications and services. Virtual network embedding is essential to load balancing in cloud computing as it ensures that virtual network requests are mapped onto physical resources in an effective and balanced manner. By effectively embedding virtual networks onto physical machines, load-balancing algorithms can divide network traffic and workload evenly across the network infrastructure, preventing any single resource from becoming overloaded. Virtual network embedding may be utilized with load-balancing strategies like least connections, weighted round-robin, and round-robin to maximize resource usage and network performance (Apat et al. 2023; Santhanakrishnan and Valarmathi 2022).
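As a minimal illustration of one of the strategies named above, the sketch below implements a naive weighted round-robin selector over physical machines; the machine names and weights are hypothetical, and production balancers typically use a smoother interleaving than this simple expansion.

```python
# Sketch of weighted round-robin: servers with higher weights receive
# proportionally more requests. Server names and weights are illustrative.
import itertools

def weighted_round_robin(servers):
    """servers: dict mapping server name -> integer weight."""
    # Expand each server according to its weight, then cycle forever.
    expanded = [name for name, w in servers.items() for _ in range(w)]
    return itertools.cycle(expanded)

selector = weighted_round_robin({"pm-a": 3, "pm-b": 1})
assignments = [next(selector) for _ in range(8)]
# -> ['pm-a', 'pm-a', 'pm-a', 'pm-b', 'pm-a', 'pm-a', 'pm-a', 'pm-b']
print(assignments)
```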

3.3 Fog computing layer

Cisco researchers first used the term fog computing in 2012 to address the shortcomings of cloud computing. To offer fast and reliable services to mobile consumers, fog computing enhances their experience by introducing a middle fog layer between consumers and the cloud. It is an improvement over cloud-based networking and computing services. The architecture of fog computing consists of fog servers acting as fog devices or fog nodes deployed in the proximity of IoT devices to provide resources for different applications. As a promising concept, fog computing introduces a decentralized architecture that enhances data processing capabilities at the network’s edge (Goel and Tiwari 2023). However, the limited resources in the fog computing model undoubtedly make it difficult to support several services for these Internet of Things applications. A prompt choice must be made regarding load balancing and application placement in the fog layer due to the diverse and ever-changing nature of application requests from IoT devices. Therefore, it is crucial to allocate resources optimally to maintain service continuity for end customers (Vergara et al. 2023). Unlike cloud computing, fog relies on distributed computing with devices near clients that have good computing capacity, and on diverse organizations for global connectivity. Mahmoud et al. (2018) introduced a new fog-enabled cloud IoT model after observing that cloud IoT is not the best option in situations where energy usage and latency are important considerations, such as the healthcare sector, where patients need to be monitored in real time without delay. The energy allocation method used to load jobs onto a fog device serves as the foundation for the entire concept. Table 3 presents a comparison between the features of the cloud and fog computing paradigms.

Table 3 Comparison of features of the cloud computing and fog computing paradigms (Swarna Priya et al. 2020; Goel and Tiwari 2023; Vergara et al. 2023)

3.4 IoT applications layer

Cloud-fog architecture finds applications in various domains, including IoT, healthcare (Alatoun et al. 2022), transportation, smart cities, and industrial automation (Dogo et al. 2019). Healthcare providers can leverage fog nodes for real-time patient monitoring, while industrial automation systems can benefit from edge analytics for predictive maintenance. Telemedicine, smart agriculture, and Industry 4.0 and 5.0 are other areas that employ IoT applications. Edge computing and cloud computing have given rise to additional computing paradigms such as mobile edge computing (MEC) and mobile cloud computing (MCC). MEC primarily emphasizes a network architecture that includes a 2- or 3-tier application and mobile devices equipped with modern wireless base stations. It improves network efficiency as well as the dissemination of application content (Sabireen and Neelanarayanan 2021).

4 Literature review on load balancing (LB) and task scheduling

We have curated a representative collection of 63 research articles for a technology review. The literature review covers the period from 2014 to 2024. The main target of LB is to spread the workload over the available assets and optimize the overall turnaround time. Before 2014, traditional methods such as FCFS, SJF, Min-min, Max–min, RR, etc., were recognized for their poor processing speeds and time-consuming job scheduling and load balancing. Konjaang et al. (2018) examine the difficulties associated with the conventional Max–Min algorithm and propose the Expa-Max–Min method as a possible solution. The algorithm prioritizes cloudlets with the longest and shortest execution times to schedule them efficiently. The workload can be divided into memory capacity issues, CPU load, and network load. In the meantime, load balancing techniques, with virtual machine management (VMM), are employed in cloud computing to distribute the load among virtual machines (Velpula et al. 2022). Hung et al. (2019) introduced an enhanced max–min algorithm called MMSIA. The objective of the MMSIA algorithm is to improve the completion time in cloud computing by utilizing machine learning to cluster requests and optimize the utilization of virtual machines. The system allocates big requests to virtual machines (VMs) with the lowest utilization percentage, improving processing efficiency. The approach integrates supervised learning into the Max–Min scheduling algorithm to enhance clustering efficiency. Kumar et al. (2018) state that the updated HEFT algorithm creates a Directed Acyclic Graph (DAG) for all jobs submitted to the cloud. It also assigns computation costs and communication edges across processing resources.
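The classic Max–Min idea underlying Expa-Max–Min and MMSIA can be sketched as follows: among the pending tasks, repeatedly pick the one whose best (minimum) completion time is largest and assign it to the VM that finishes it earliest. The execution-time matrix below is invented for illustration, and the sketch omits the clustering and learning extensions of the cited methods.

```python
# Sketch of the classic Max-Min heuristic.
# exec_time[i][j] = estimated execution time of task i on VM j (illustrative values).

def max_min_schedule(exec_time, num_vms):
    ready = {vm: 0.0 for vm in range(num_vms)}      # current load per VM
    pending = set(range(len(exec_time)))
    schedule = {}
    while pending:
        # For each pending task, find its best VM (minimum completion time).
        best = {t: min((ready[v] + exec_time[t][v], v) for v in range(num_vms))
                for t in pending}
        # Max-Min rule: among those minima, pick the task with the LARGEST completion time.
        task = max(pending, key=lambda t: best[t][0])
        finish, vm = best[task]
        schedule[task] = vm
        ready[vm] = finish
        pending.remove(task)
    return schedule, max(ready.values())  # assignment and resulting makespan

exec_time = [[4.0, 6.0], [8.0, 5.0], [3.0, 7.0]]
print(max_min_schedule(exec_time, num_vms=2))   # -> ({1: 1, 0: 0, 2: 0}, 7.0)
```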

The ordering of tasks is determined by their execution priority, which considers the average time it takes to complete each task on all processors and the costs associated with communication between predecessor tasks. Subsequently, the tasks are organized in a list according to decreasing priority and assigned to processors based on the shortest execution time. In the same way, Seth and Singh (2019) propose the Dynamic Heterogeneous Shortest Job First (DHSJF) model as a solution for work scheduling in cloud computing systems with varying capabilities. The algorithm entails the establishment of a heterogeneous cloud computing environment, the dynamic generation of cloudlet lists, and the analysis of workload and resource heterogeneity to minimize the makespan. The DHSJF algorithm efficiently schedules dynamic requests to various resources, resulting in optimized utilization of resources. This method overcomes the limitations of the conventional Shortest Job First (SJF) method. A task scheduling process is shown graphically in Fig. 3.

Fig. 3
figure 3

Working of task scheduling in cloud computing
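A simplified sketch of the priority-ordered list scheduling described above follows: tasks are ranked by average execution time across processors (a stand-in for the HEFT upward rank) and assigned, highest rank first, to the processor giving the earliest finish time. Communication costs between predecessor tasks are deliberately omitted, and the matrix values are illustrative.

```python
# Simplified list scheduling in the spirit of HEFT: rank tasks by average
# execution time across processors, then assign each (highest rank first)
# to the processor giving the earliest finish. DAG communication costs are
# omitted for brevity; values are illustrative.

def list_schedule(exec_time):
    num_procs = len(exec_time[0])
    # Priority = average execution time over all processors (a proxy for upward rank).
    rank = {t: sum(row) / num_procs for t, row in enumerate(exec_time)}
    order = sorted(rank, key=rank.get, reverse=True)

    finish = [0.0] * num_procs   # time at which each processor becomes free
    placement = {}
    for task in order:
        # Earliest-finish-time rule: try every processor, keep the best.
        proc = min(range(num_procs), key=lambda p: finish[p] + exec_time[task][p])
        finish[proc] += exec_time[task][proc]
        placement[task] = proc
    return placement, max(finish)

exec_time = [[5.0, 7.0], [9.0, 4.0], [2.0, 6.0]]
print(list_schedule(exec_time))   # -> ({1: 1, 0: 0, 2: 0}, 7.0)
```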

Another technique that many authors increasingly employ is GWO. The GWO technique correlates the duties of grey wolves with viable solutions for distributing jobs or equalizing workloads inside a network or computing system. The Alpha wolves lead the pack, representing the most optimal solution achieved up to this point. The Alpha receives assistance in decision-making and problem-solving from the Beta and Delta wolves, who represent the second and third most optimal alternatives, respectively. The omega wolves, who stand for the remaining solutions, are inspired by the top three wolves. The algorithm represents the exploration and exploitation stages in pursuing the optimal solution through a repetitive process of encircling, hunting, and attacking the target. In 2020, Farrag et al. (2020) published a work that examines the application of the Ant-Lion optimizer (ALO) and Grey wolf optimizer (GWO) in job scheduling for Cloud Computing. The objective of ALO and GWO is to optimize the makespan of tasks in cloud systems by effectively dividing the workload. Although ALO and GWO surpass the Firefly Algorithm (FFA) in minimizing makespan, their performance relative to PSO varies depending on the specific conditions. Reddy et al. (2022) introduced the AVS-PGWO-RDA scheme, which utilizes Probabilistic Grey Wolf optimization (PGWO) in the load balancer unit to find the ideal fitness value for selecting user tasks and allocating resources for tasks with lower complexity and time consumption. The AVS approach is employed to cluster related workloads, and the RDA-based scheduler ultimately assigns these clusters to suitable virtual machines (VMs) in the cloud environment. Similarly, Janakiraman and Priya (2023) introduced the Hybrid Grey Wolf and Improved Particle Swarm Optimization Algorithm with Adaptive Inertial Weight-based multi-dimensional Learning Strategy (HGWIPSOA). This algorithm combines the Grey Wolf Optimization Algorithm (GWOA) with Particle Swarm Optimization (PSO) to efficiently assign tasks to Virtual Machines (VMs) and improve the accuracy and speed of task scheduling and resource allocation in cloud environments. The suggested system effectively tackles the limitations of previous LB approaches by preventing premature convergence and enhancing global search capability. As a result, it provides several benefits, including improved throughput, reduced makespan, reduced degree of imbalance, decreased latency, and reduced execution time. The combination of GWO with GA, as demonstrated by Behera and Sobhanayak (2024), yields superior results. It provides faster convergence and minimum makespan in large task scheduling scenarios.
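A compact numerical sketch of the GWO search loop described above is given below for a continuous encoding of a task-to-VM assignment; the fitness function (makespan over a random execution-time matrix), the rounding-based decoding, and all parameter values are assumptions made for illustration, not those of the cited schedulers.

```python
# Minimal GWO sketch: wolves are continuous position vectors that encode a
# task-to-VM mapping (each position value is rounded into a VM index).
import random

NUM_TASKS, NUM_VMS, WOLVES, ITERS = 8, 3, 10, 50
exec_time = [[random.uniform(1, 10) for _ in range(NUM_VMS)] for _ in range(NUM_TASKS)]

def fitness(pos):
    load = [0.0] * NUM_VMS
    for task, x in enumerate(pos):
        vm = min(NUM_VMS - 1, max(0, int(round(x))))
        load[vm] += exec_time[task][vm]
    return max(load)  # makespan: lower is better

wolves = [[random.uniform(0, NUM_VMS - 1) for _ in range(NUM_TASKS)] for _ in range(WOLVES)]
for it in range(ITERS):
    wolves.sort(key=fitness)
    alpha, beta, delta = wolves[0], wolves[1], wolves[2]   # three best solutions
    a = 2 - 2 * it / ITERS                                 # exploration -> exploitation
    for w in wolves[3:]:                                   # omegas follow the leaders
        for d in range(NUM_TASKS):
            new = 0.0
            for leader in (alpha, beta, delta):
                A = a * (2 * random.random() - 1)
                C = 2 * random.random()
                new += leader[d] - A * abs(C * leader[d] - w[d])  # encircling step
            w[d] = min(NUM_VMS - 1, max(0.0, new / 3))

print("best makespan:", round(fitness(min(wolves, key=fitness)), 2))
```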

From 2014 onwards, metaheuristic and hybrid-metaheuristic algorithms were used to address cloud computing optimization and load-balancing challenges. Zhan et al. (2014) suggested a load-aware genetic algorithm called LAGA, a modified version of the genetic algorithm (GA). LAGA employs the TLB model to optimize makespan and load balance, establishing a new fitness function to find suitable schedules that maintain makespan while preserving load balance. Rekha and Dakshayini (2019) introduced a task allocation method for cloud environments that utilizes a Genetic Algorithm. The purpose of this strategy is to minimize job completion time and enhance overall performance. The algorithm considers multiple objectives, such as energy consumption and quick responses, to make the best decisions regarding resource allocation. The evaluation findings exhibit superior throughput using the proposed approach, indicating its efficacy in task allocation decision-making. In 2023, Mishra and Majhi (2023) proposed a hybrid meta-heuristic technique called GAYA, which combines the Genetic Algorithm (GA) and the JAYA algorithm. The purpose of this technique is to efficiently schedule dynamically independent biological data. The GAYA algorithm showcases improved exploitation and exploration abilities, rendering it a highly viable solution for scheduling dynamic medical data in cloud-based systems. Brahmam and Vijay Anand (2024) developed a model called VMMISD, in which they combined a Genetic Algorithm (GA) with Ant Colony Optimization (ACO) for resource allocation. The system also utilizes combined optimization techniques, iterative security protocols, and deep learning algorithms to enhance the efficiency of load balancing during virtual machine migrations. The model employs K-means clustering, Fuzzy Logic, Long Short-Term Memory (LSTM) networks, and Graph Networks to anticipate workloads, make decisions, and measure the affinity between virtual machines (VMs) and physical machines. Behera and Sobhanayak (2024) also proposed a hybrid approach that combines the Grey Wolf Optimizer (GWO) and Genetic Algorithm (GA). The hybrid GWO-GA algorithm effectively reduces makespan, energy consumption, and computing costs, surpassing conventional algorithms in performance. It exhibits accelerated convergence in extensive scheduling problems, offering an edge over earlier techniques.
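The GA workflow common to LAGA, GAYA, and the other GA-based schedulers above (selection, crossover, and mutation over an assignment chromosome) can be sketched as follows; the encoding, the makespan fitness, and all parameter values are illustrative and are not taken from the cited implementations.

```python
# Sketch of a genetic algorithm for task-to-VM assignment.
# Chromosome = list where gene i is the VM index of task i; fitness = makespan.
import random

NUM_TASKS, NUM_VMS, POP, GENS = 10, 3, 20, 100
exec_time = [[random.uniform(1, 10) for _ in range(NUM_VMS)] for _ in range(NUM_TASKS)]

def makespan(chrom):
    load = [0.0] * NUM_VMS
    for task, vm in enumerate(chrom):
        load[vm] += exec_time[task][vm]
    return max(load)

def tournament(pop):
    return min(random.sample(pop, 3), key=makespan)   # selection: best of 3

population = [[random.randrange(NUM_VMS) for _ in range(NUM_TASKS)] for _ in range(POP)]
for _ in range(GENS):
    next_gen = [min(population, key=makespan)]        # elitism: keep the best
    while len(next_gen) < POP:
        p1, p2 = tournament(population), tournament(population)
        cut = random.randrange(1, NUM_TASKS)          # one-point crossover
        child = p1[:cut] + p2[cut:]
        if random.random() < 0.1:                     # mutation: reassign one task
            child[random.randrange(NUM_TASKS)] = random.randrange(NUM_VMS)
        next_gen.append(child)
    population = next_gen

print("best makespan:", round(makespan(min(population, key=makespan)), 2))
```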

The combination of autoscaling and reinforcement learning (RL) has garnered significant attention in recent years due to its ability to allocate resources proactively (Joshi et al. 2024). Deep reinforcement learning (DRL) is a promising technique that automates the process of predicting workloads. DRL can make immediate decisions on resource allocation based on real-time monitoring of the system’s workload and performance parameters to effectively fulfil the system’s present demands. Ran et al. (2019) introduced a task-scheduling strategy based on deep reinforcement learning (DRL). The working of the DRL-based load balancer is shown in Fig. 4. This method assigns tasks to various virtual machines (VMs) in a dynamic manner, resulting in a decrease in average response time and ensuring load balancing. The technique is examined on a tower server with specific configurations and software tools. It showcases its efficacy in balancing load across virtual machines (VMs) while adhering to service level agreement (SLA) limits. The approach employs deep reinforcement learning (DRL) and deep deterministic policy gradient (DDPG) networks to create optimal scheduling decisions by learning directly from experience without prior knowledge. In addition, Jyoti and Shrimali (2020) employed DRL in their research and proposed a technique that combines Multi-agent ‘Deep Reinforcement Learning-Dynamic Resource Allocation’ (MADRL-DRA) in the Local User Agent (LUA) with a Dynamic Optimal Load-Aware Service Broker (DOLASB) in the Global User Agent (GUA) to improve quality of service (QoS) metrics by allocating resources dynamically. The method demonstrates enhanced performance in terms of execution time, waiting time, energy efficiency, throughput, resource utilization, and makespan when compared to traditional approaches. Tong et al. (2021) present a new technique for task scheduling using deep reinforcement learning (DRL) that aims to reduce the imbalance of virtual machine (VM) load and the rate of job rejection while also considering service-level agreement limitations. The proposed DDMTS method exhibits stability and outperforms other algorithms in effectively balancing the Degree of Imbalance (DI) and minimizing the job rejection rate. The precise configurations of state, action, and reward in the DDMTS algorithm are essential for its efficacy in resolving task scheduling difficulties using the DQN algorithm.

Fig. 4
figure 4

Working of load balancer in cloud computing
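For a DRL-based scheduler of the kind shown in Fig. 4, the key design choices are the state, action, and reward definitions. The definitions below are a hedged illustration rather than the formulations of the cited papers: the state is assumed to be the VM load vector plus the size of the arriving task, the action is the index of the chosen VM, and the reward penalises load imbalance and a response-time proxy.

```python
# Illustrative state/action/reward definitions for a DRL-based task scheduler.
# These are assumptions made for demonstration, not the cited works' formulations.

def make_state(vm_loads, task_size):
    # State: normalised VM loads plus the size of the arriving task.
    total = sum(vm_loads) or 1.0
    return tuple(l / total for l in vm_loads) + (task_size,)

def step(vm_loads, task_size, action):
    """Apply an action (chosen VM index) and return the next loads and the reward."""
    next_loads = list(vm_loads)
    next_loads[action] += task_size
    imbalance = max(next_loads) - min(next_loads)      # degree of imbalance
    response = next_loads[action]                       # waiting + execution proxy
    reward = -(imbalance + 0.1 * response)              # higher is better
    return next_loads, reward

loads, r = step([3.0, 5.0, 1.0], task_size=2.0, action=2)
print(loads, round(r, 2))   # -> [3.0, 5.0, 3.0] -2.3
```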

Double Deep Q-learning has also been employed to address load-balancing concerns. Swarup et al. (2021) introduced a method utilizing Deep Reinforcement Learning (DRL) to address job scheduling in cloud computing. Their approach employs a Clipped Double Deep Q-learning algorithm to minimize computational costs while adhering to resource and deadline constraints. The algorithm employs target network and experience replay techniques to maximize its objective function. The algorithm balances exploration and exploitation by using the ε-greedy policy. This policy establishes the approach for selecting actions by considering the trade-off between exploration and exploitation: the system chooses actions randomly for exploration or based on Q-values for exploitation, thus maintaining a balance between attempting new alternatives and utilizing existing ones. In the same way, Kruekaew et al. employ Q-learning to optimize job scheduling and resource utilization. The suggested method, Multi-Objective ABCQ (MOABCQ), integrates the Artificial Bee Colony Algorithm with Q-learning to optimize task scheduling, resource utilization, and load balancing in cloud environments. MOABCQ exhibited superior throughput and a higher Average Resource Utilization Ratio (ARUR) than alternative algorithms, with Q-learning enhancing the efficiency of the ABC algorithm. Figure 5 presents the hybridisation trend of various techniques observed in the literature review.
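Two of the ingredients mentioned above, ε-greedy action selection and the clipped double-Q target, can be illustrated in tabular form as below; this is a simplification of the deep-network version described in the cited work, with all numbers invented for the example.

```python
# Sketch of epsilon-greedy action selection and a clipped double-Q target.
# Tabular simplification of the deep version; all values are illustrative.
import random

def epsilon_greedy(q_values, epsilon=0.1):
    """Explore with probability epsilon, otherwise exploit the best-known action."""
    if random.random() < epsilon:
        return random.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])

def clipped_double_q_target(reward, q1_next, q2_next, gamma=0.99):
    """Bootstrap with the smaller of two estimates to curb Q-value over-estimation."""
    a_star = max(range(len(q1_next)), key=lambda a: q1_next[a])
    return reward + gamma * min(q1_next[a_star], q2_next[a_star])

action = epsilon_greedy([0.2, 0.7, 0.1])
target = clipped_double_q_target(reward=-1.5, q1_next=[0.4, 0.9], q2_next=[0.5, 0.6])
print(action, round(target, 3))   # e.g. 1 -0.906
```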

Fig. 5
figure 5

Hybridization trend of some techniques as observed in SLR

Furthermore, the swarm-based technique known as Particle Swarm Optimisation (PSO) is increasingly being adopted by researchers to address challenges related to load balancing in cloud computing. Using PSO, combined with other prominent methods, leads to an ideal solution through extensive investigation and exploration of the search space. Panwar et al. (2019) introduced a TOPSIS-PSO method designed for non-preemptive task scheduling in cloud systems. The approach tackles task scheduling challenges by employing the TOPSIS method to evaluate tasks according to execution time, transmission time, and cost. Subsequently, optimisation is performed using PSO. The proposed method optimises the makespan, execution time, transmission time, and cost metrics. In 2020, Agarwal et al. (2020) introduced a Mutation-based Particle Swarm Optimisation (PSO) algorithm to tackle issues such as premature convergence, decreased convergence speed, and being trapped in local optima. The suggested method seeks to minimise performance characteristics such as makespan time and enhance the fitness function in cloud computing. In 2021, Negi et al. (2021) introduced a hybrid load-balancing algorithm in cloud computing called CMODLB. This technique combines machine learning and soft computing techniques. The method employs artificial neural networks, fuzzy logic, and clustering techniques to distribute the workload evenly. The system utilises Bayesian optimisation-based augmented K-means for virtual machine clustering and the TOPSIS-PSO method for work scheduling. VM migration decisions are determined with an interval type-2 fuzzy logic system that relies on load conditions. Although these algorithms demonstrated strong performance, they do not consider the specific type of content used by users. Adil et al. (2022) found that knowledge about the type of content in tasks can significantly enhance scheduling efficiency and reduce the workload on virtual machines (VMs). The PSO-CALBA system categorises user tasks into several content types, such as video, audio, image, and text, using a Support Vector Machine (SVM) classifier. The categorisation begins by selecting file fragments, which are tasks that consist of diverse file fragments of different content types. The initial classification stage involves utilising the Radial Basis Function (RBF) kernel approach to analyse high-dimensional data, which is a significant challenge. Pradhan et al. (2022) provided a solution for the issue of handling complicated and high-dimensional data in a cloud setting. To address this challenge, they utilised deep reinforcement learning (DRL) and parallel particle swarm optimisation (PSO). The proposed technique synergistically integrates Particle Swarm Optimisation (PSO) and Deep Reinforcement Learning (DRL) to optimise rewards by minimising both makespan time and energy consumption while ensuring high accuracy and fast execution. The algorithm iteratively enhances accuracy, demonstrating superior performance in dynamic environments, and can handle various tasks in cloud environments. Jena et al. (2022) found that the QMPSO algorithm successfully distributes the workload evenly among virtual machines, resulting in improved makespan, throughput, and energy utilisation, and reduced task waiting time. The performance of the hybridisation of modified Particle Swarm Optimisation (MPSO) and improved Q-learning in QMPSO is enhanced by modifying the velocity based on the best action generated through Q-learning.
The technique employs dynamic resource allocation to distribute tasks among virtual machines (VMs) with varying priorities. This approach aims to minimise task waiting time and maximise VM throughput. This strategy is highly efficient for independent tasks.
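The PSO variants above share the same core velocity and position update; the sketch below applies it to a continuous task-to-VM encoding with a linearly descending inertia weight in the spirit of LDAIW. The encoding, fitness, and coefficient values are illustrative assumptions rather than the cited configurations.

```python
# Minimal PSO sketch for task-to-VM assignment with a linearly descending
# inertia weight. Encoding, fitness, and coefficients are illustrative.
import random

NUM_TASKS, NUM_VMS, PARTICLES, ITERS = 8, 3, 15, 60
exec_time = [[random.uniform(1, 10) for _ in range(NUM_VMS)] for _ in range(NUM_TASKS)]

def makespan(pos):
    load = [0.0] * NUM_VMS
    for t, x in enumerate(pos):
        vm = min(NUM_VMS - 1, max(0, int(round(x))))
        load[vm] += exec_time[t][vm]
    return max(load)

pos = [[random.uniform(0, NUM_VMS - 1) for _ in range(NUM_TASKS)] for _ in range(PARTICLES)]
vel = [[0.0] * NUM_TASKS for _ in range(PARTICLES)]
pbest = [p[:] for p in pos]
gbest = min(pbest, key=makespan)[:]

for it in range(ITERS):
    w = 0.9 - (0.9 - 0.4) * it / ITERS        # inertia weight descends linearly
    for i in range(PARTICLES):
        for d in range(NUM_TASKS):
            r1, r2 = random.random(), random.random()
            vel[i][d] = (w * vel[i][d]
                         + 2.0 * r1 * (pbest[i][d] - pos[i][d])    # cognitive pull
                         + 2.0 * r2 * (gbest[d] - pos[i][d]))      # social pull
            pos[i][d] = min(NUM_VMS - 1, max(0.0, pos[i][d] + vel[i][d]))
        if makespan(pos[i]) < makespan(pbest[i]):
            pbest[i] = pos[i][:]
    gbest = min(pbest, key=makespan)[:]

print("best makespan:", round(makespan(gbest), 2))
```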

Load balancing poses a significant challenge in Fog computing due to limited resources. Talaat et al. (2022) introduced a method called Effective Dynamic Load Balancing (EDLB) that utilises Convolutional Neural Networks (CNN) and Multi-Objective Particle Swarm Optimisation (MPSO) to optimise resource allocation in fog computing environments to maximise resource utilisation. The EDLB system comprises three primary modules: the Fog Resource Monitor (FRM), the CNN-based Classifier (CBC), and the Optimised Dynamic Scheduler (ODS). The FRM system monitors the utilisation of server resources, while the CBC system classifies fog servers. Additionally, the ODS system allocates incoming tasks to the most appropriate server, reducing response time and enhancing resource utilisation. This strategy effectively decreases response time. Comparably, Nabi et al. (2022) presented an Adaptive Particle Swarm Optimisation (PSO)-Based Task Scheduling Approach for Cloud Computing, explicitly emphasising achieving load balance and optimisation. The solution incorporates a technique called Linearly Descending and Adaptive Inertia Weight (LDAIW) to improve the efficiency of job scheduling. The methodology employs a population-based scheduling system that draws inspiration from swarm intelligence. In this technique, particles represent solutions, and their updates are determined by factors such as inertia weight, personal best, and global best. The method can reduce task execution time, increase throughput, and better balance local and global search.

Table 4 maps the sections and subsections of this SLR to the research questions they address. Table 5 gives an overview of the strengths and weaknesses of the state-of-the-art techniques, and a comparative analysis of these methods on publicly benchmarked datasets is presented in Table 6.

Table 4 The Table highlights sections and subsections to answer the research questions
Table 5 Some state-of-the-art load balancing algorithms with weakness and strength features

4.1 Some essential load balancing metrics

It is evident that meticulous monitoring and analysis of metrics enhance resource utilization, minimize downtime, and ensure a seamless user experience, ultimately boosting overall system reliability and scalability. Several metrics employed for assessing the balance of loads in the cloud are illustrated in Fig. 6.

  • Throughput: In cloud load balancing, throughput refers to the rate at which a cloud infrastructure can process and serve data or requests. Specifically, it represents the amount of work accomplished within a given time frame, reflecting the efficiency of the system’s ability to handle concurrent user demands. High throughput ensures that data or requests can be processed quickly and reliably, minimising latency and optimising resource utilisation. Throughput (tp) can be calculated by using the mathematical formula given in Eq. (1) below:

    $$ tp = \sum\limits_{n = 1}^{j} {ExT_{n}} $$
    (1)

    where j is the number of tasks and ExTn is the execution time of the nth task.

  • Makespan: Makespan denotes the overall duration needed to finish a specific set of tasks or jobs within a cloud computing environment. Minimum makespan represents the efficiency and performance of the system in handling and processing tasks. It can be calculated with the help of the following formula:

    $$ Makespan = \max \left( {ExT_{j} \mid j = 1,\;2,\;3, \ldots } \right) $$
    (2)

    In Eq. (2), ExTj is the execution time of the jth virtual machine. A robust and efficient load balancing algorithm has a minimum makespan time.

  • Response time: Response time is the interval between the moment a user makes a request and the moment the cloud infrastructure delivers a response. Minimizing response time is crucial to providing a seamless user experience and ensuring optimal performance.

  • Reliability: It indicates the system’s ability to effectively handle failures, prevent downtime, and maintain continuous service availability. A reliable load balancer detects and mitigates failures promptly, ensures seamless failover mechanisms, and provides continuous and reliable service to users even in the event of disruptions or high-load conditions.

  • Migration time: Migration time refers to the duration required to transfer workloads or applications from one server or data center to another within the cloud infrastructure. It encompasses the process of migrating virtual machines, containers, or services to optimize resource allocation and handle changes in demand.

  • Bandwidth: It represents the capacity or available channel for data communication. It also refers to the maximum data capacity that may be transferred across a network connection within a specific period. Adequate bandwidth is essential for efficient load balancing, as it ensures the smooth and timely flow of data between servers and clients.

  • Resource utilization: It refers to the efficient allocation and management of computing resources within a cloud infrastructure to meet the demands of varying workloads. It involves optimizing the utilization of servers, storage, network bandwidth, and other resources to maximize performance and minimize waste. It can be measured with the help of a mathematical formula, as given in Eq. (3):

    $$ ResU\left( {VM_{k} } \right) = CT_{jk} ~/~Makespan $$
    (3)

    In Eq. (3), ResU is the resource utilization of the kth virtual machine (VM); CTjk is the completion time of the jth job on the kth VM. A short computational example of these metrics follows this list.

Fig. 6
figure 6

Classification of load balancing algorithms

  • Energy consumption: It can be defined as the ability of a cloud infrastructure to optimize its power consumption while maintaining optimal performance. Load balancing reduces energy consumption by dynamically allocating computing resources and powering down underutilized servers during low-demand periods. By minimizing power usage, cloud load balancing systems contribute to reducing carbon footprints, operational costs, and environmental impact while ensuring sustainable and eco-friendly operations in cloud computing environments.

  • Fault tolerance: The ability of a system to continue functioning uninterrupted in the presence of failures or errors. It involves designing load-balancing algorithms and mechanisms that can withstand and recover from various faults, such as server failures, network outages, or traffic spikes (Tawfeeg et al. 2022).
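As a worked example of the metrics above, the snippet below computes makespan (Eq. 2), per-VM resource utilization (Eq. 3), and one common formulation of the degree of imbalance for a hypothetical task-to-VM assignment; the numbers, VM names, and the DI formula used here are illustrative assumptions.

```python
# Worked example of load-balancing metrics for a hypothetical task-to-VM assignment.
# completion[vm] = list of completion (execution) times of jobs placed on that VM.

completion = {"vm-0": [4.0, 3.0], "vm-1": [9.0], "vm-2": [2.0, 2.5]}

vm_finish = {vm: sum(times) for vm, times in completion.items()}  # busy time per VM
makespan = max(vm_finish.values())                                # Eq. (2)

# Resource utilization per VM (Eq. 3): completion time on the VM / makespan.
utilization = {vm: t / makespan for vm, t in vm_finish.items()}

# Degree of imbalance (one common definition): spread of per-VM finish times
# relative to their mean.
mean_finish = sum(vm_finish.values()) / len(vm_finish)
degree_of_imbalance = (max(vm_finish.values()) - min(vm_finish.values())) / mean_finish

print(f"makespan={makespan}, utilization={utilization}, DI={degree_of_imbalance:.2f}")
```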

4.2 Taxonomy of load balancing algorithms and challenges associated with them

Mishra and Majhi (2020) have categorized the load balancing algorithms into four broad classes: Traditional, Heuristic, Meta-heuristic, and Hybrid. The authors have also explained the subcategories of meta-heuristic and hybrid algorithms based on their nature. Tawfeeg et al. (2022) have discussed three main categories of load-balancing algorithms, namely static, dynamic, and hybrid. Tripathy et al. (2023) mentioned in their review that the load-balancing algorithm based on their environment is generally classified into three main classes: static, dynamic, and nature-inspired. In this systematic review paper, we have tried to include the maximum range of algorithms by covering all the categories and sub-categories. Figure 6 represents all categories of load-balancing algorithms (Table 6).

  • Traditional Algorithms: Traditional algorithms are mainly classified into preemptive and non-preemptive. Preemptive means forcefully stopping an ongoing execution to serve a higher-priority task; after the higher-priority job completes, the preempted job is resumed. The priority of a task can be internal or external. Traditional algorithms commonly employed for load balancing include Round Robin (RR), Weighted Round Robin, Least Connection, and Weighted Least Connection. Round Robin assigns requests cyclically to each server, ensuring an equal distribution. Weighted Round Robin provides scalability by considering server weights and allocating a proportionate number of requests to each server based on its capabilities and performance (Praditha et al. 2023). The Least Connection (LC) algorithm assigns requests to the server with the fewest active connections, promoting load distribution efficiency. The Weighted Least Connection (WLC) algorithm enhances LC by considering server weights: it assigns requests to servers with the least active connections, scaling the distribution based on server capabilities (a small selection sketch follows this list). Preemptive scheduling algorithms include round-robin and priority-based scheduling; non-preemptive algorithms include Shortest Job First (SJF) and First Come First Serve (FCFS).

  • Heuristic-based Algorithms: Heuristic algorithms are problem-solving techniques that rely on practical rules, intuition, and experience rather than precise mathematical models. These are used to find approximate solutions in a reasonable amount of time. The heuristic algorithms aim to distribute workload efficiently among cloud and fog nodes. Compared to hybrid and meta-heuristic algorithms, heuristic algorithms are relatively straightforward and have reduced computational complexity. They often provide reasonable solutions but lack guarantees of optimality. There are two types of heuristic techniques: static and dynamic. When a task’s estimated completion time is known, the static heuristic is used. When tasks arrive dynamically, a dynamic heuristic can be applied. Algorithms like Min-min, Max-min (Mao et al. 2014), RASA, Modified Heterogeneous Earliest Finish Time (HEFT) (Dubey et al. 2018), Improved Max-min (Hung et al. 2019) and DHSJF (Seth and Singh 2019) are the prominent examples of the heuristic category.

  • Meta-heuristic based algorithms: Meta-heuristic algorithms are good at finding a global solution without falling into local optima. A meta-heuristic algorithm is a problem-solving technique that guides the search process by iteratively refining potential solutions. It is used to find approximate solutions for complex optimization problems, especially in cloud computing, where traditional algorithms often struggle due to the inherent complexity and dynamic nature of the environment. A particular meta-heuristic algorithm that has proven effective in cloud computing is the Genetic Algorithm (GA) (Rekha and Dakshayini 2019). GA mimics the process of natural selection, evolving a population of solutions to find strong candidates. By employing genetic operators like selection, crossover, and mutation, GA explores the solution space intelligently, adapting to changing conditions and providing near-optimal solutions for resource allocation, task scheduling, and load balancing in cloud computing environments. Other examples from the reviewed literature are GWO (Reddy et al. 2022), ACO (Dhaya and Kanthavel 2022), TBSLB-PSO (Ramezani et al. 2014), TOPSIS-PSO (Konjaang et al. 2018), and Modified BAT (Latchoumi and Parthiban 2022). When two meta-heuristic methods are combined, the new method is a hybrid meta-heuristic; an example is Ant Colony Optimization with Particle Swarm (ACOPS) (Cho et al. 2015).

  • Hybrid based algorithms: Hybrid algorithms integrate the advantages of centralized and distributed load-balancing algorithms to achieve better performance and scalability. This approach leverages the centralized component to monitor and collect real-time information about the system’s state, workload, and resource availability (Geetha et al. 2024). Simultaneously, it incorporates distributed load-balancing techniques to efficiently divide the workload among fog nodes. This hybrid approach enhances the overall load-balancing efficiency, reduces network congestion, and improves the system’s response time. By dynamically adapting to changing workload patterns and resource availability, the hybrid algorithm ensures optimal resource utilization and enhances user satisfaction. A hybrid method that combines the Genetic Algorithm (GA) and the Grey Wolf Optimization Algorithm (GWO) is proposed by Behera and Sobhanayak (2024). The hybrid GWO-GA algorithm minimizes cost, energy usage, and makespan. Similarly, other examples from the literature review are GAYA (Mishra and Majhi 2023), VMMSID (Brahmam and Vijay Anand 2024), DTSO-TS (Ledmi et al. 2024), etc.

  • ML-Centric algorithms: These algorithms combine machine learning capabilities with existing algorithms to automate their operation. This is one of the latest approaches in the research area and has proven effective for real-time scenarios. To address the challenges of load balancing, researchers have been increasingly focusing on machine-learning-centric algorithms. ML-based algorithms offer promising results in load balancing by dynamically allocating tasks based on workload characteristics and resource availability. These algorithms leverage ML techniques such as reinforcement learning, deep learning, and clustering to intelligently predict and allocate the workload across cloud-fog computing environments. ML-centric algorithms deliver improved performance, reduced response time, and enhanced resource utilization by continuously learning from historical data and adapting to changing conditions. Furthermore, these algorithms also consider energy consumption and network traffic factors, ensuring a holistic load-balancing approach (Muchori and Peter 2022). Examples of ML-centric algorithms from the reviewed literature are DRL (Ran et al. 2019), MADRL-DRA (Jyoti and Shrimali 2020), TS-DT (Mahmoud et al. 2022), FF-NWRDLB (Prabhakara et al. 2023), etc.
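The least-connection rules referenced in the first bullet of this list can be sketched as follows: plain LC picks the server with the fewest active connections, while WLC picks the smallest ratio of active connections to weight. Server names, weights, and connection counts below are hypothetical.

```python
# Sketch of Least Connection (LC) and Weighted Least Connection (WLC) selection.
# Server names, weights, and connection counts are illustrative.

servers = {"s1": {"weight": 4, "active": 12},
           "s2": {"weight": 1, "active": 2},
           "s3": {"weight": 2, "active": 5}}

def least_connection(servers):
    return min(servers, key=lambda s: servers[s]["active"])

def weighted_least_connection(servers):
    # WLC scales active connections by capacity: pick the smallest active/weight ratio.
    return min(servers, key=lambda s: servers[s]["active"] / servers[s]["weight"])

print(least_connection(servers))            # -> 's2' (fewest active connections)
print(weighted_least_connection(servers))   # -> 's2' (2/1=2.0 < 5/2=2.5 < 12/4=3.0)
```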

Table 6 A comparative analysis of state-of-the-art methods on publicly benchmarked datasets

Table 7 provides a comprehensive overview of recent load balancing and task scheduling algorithms, presenting information on the proposed technology, the technologies compared against, research limitations, results, tools used, and potential future directions. Additionally, Table 8 outlines the evaluation metrics, the advantages and disadvantages of the technologies reviewed, and the objectives of each study.

Table 7 Comprehensive study on load balancing and task scheduling techniques in cloud computing
Table 8 Detailed literature review on advantages and disadvantages of various studies

5 Application areas of load balancing in cloud and fog computing

There are various application areas where load balancing is crucial. The healthcare sector is one area where efficient resource utilization and load balancing are highly desirable. According to Mahmoud et al. (2018), fog computing integrated with IoT-based healthcare architecture improves latency, energy consumption, mobility, and Quality of Service, enabling efficient healthcare services regardless of location. Fog-enabled Cloud-of-Things (CoT) system models with energy-aware allocation strategies result in more energy-efficient operations, which are crucial for healthcare applications sensitive to delays and energy consumption. Yong et al. (2016) propose a dynamic load balancing approach using SDN technology in a cloud data center, enabling real-time monitoring of service node flow and load state, as well as global resource assignment for uneven system load distribution. Dogo et al. (2019) introduced a mist computing system for better connectivity and resource utilization in smart cities and industries. According to the authors, mist computing enables smart cities to intelligently adapt to dynamic events and changes, enhancing urban operations. Mist computing is more suitable for realizing smart city solutions where streets adapt to different conditions, promoting energy conservation and efficient operations. Similarly, Sharif et al. (2023) presented a paper that discusses the rapid growth of IoT devices and applications, emphasizing the need for efficient task scheduling and resource allocation in edge computing for health surveillance systems. The proposed Priority-based Task Scheduling and Resource Allocation (PTS-RA) mechanism aims to manage emergency conditions efficiently, meeting the requirements of latency-sensitive tasks with reduced bandwidth cost. On the same track, Aqeel et al. (2023) proposed a CHROA model that can be utilized for energy-efficient and intelligent load balancing in cloud-enabled IoT environments, particularly in healthcare, where real-time applications generate large volumes of data. Sah Tyagi et al. (2021) presented a neural network-based resource allocation model for an energy-efficient WSN-based smart Agri-IoT framework. The model improves dynamic clustering and optimizes cluster size. The approach combines BPNN (Backpropagation Neural Network), APSO (Adaptive Particle Swarm Optimization), and BNN (Binary Neural Network) to accomplish the effective allocation of agricultural resources. This integration showcases notable progress in cooperative networking and overall optimization of resources. In the same manner, Dhaya and Kanthavel (2022) emphasize the importance of energy efficiency in agriculture and the challenges in resource allocation, and introduce a novel ‘Naive Multi-Phase Resource Allocation Algorithm’ to enhance energy efficiency and optimize agricultural resources effectively in a dynamic environment. In this way, there are several application areas where load balancing and resource scheduling are crucial. In the future, transportation, Industry 4.0 and 5.0, IoT network systems, smart cities, smart agriculture, and healthcare systems will be hotspots for research on load balancing. The following are the areas where resource allocation and utilization are critical and where cloud service utilization is highest:

  1. Telemedicine (Verma et al. 2024)

  2. Industry 4.0 and Industry 5.0 (Teoh et al. 2023)

  3. Healthcare system (Talaat et al. 2022)

  4. Agriculture (Agri-IoT) (Dhaya and Kanthavel 2022; Sah Tyagi et al. 2021)

  5. Real-time monitoring services (Yong et al. 2016)

  6. Smart cities (Alam 2021)

  7. Digital twinning (Zhou et al. 2022; Adibi et al. 2024)

  8. Smart business and analytics (Nag et al. 2022)

  9. E-commerce (Sugan and Isaac Sajan 2024)

6 Research queries and inferences

After the detailed literature review, the answers to the research questions were inferred from the examined material itself, without bias and without adding the authors' own views. The inferences drawn are presented below in the form of answers to each research question.

Q1. What load balancing and task scheduling techniques are commonly used in cloud computing environments?

This SLR divides the current techniques into five categories: traditional, heuristic, meta-heuristic, ML-centric, and hybrid. We employed the content analysis method to determine the category of each technique used in the literature study, as shown in Table 7. The review indicates that hybrid, meta-heuristic, and ML-centric algorithms are the techniques researchers most frequently choose for solving load-balancing problems in cloud computing systems. The percentage-wise utilization of the various techniques is depicted in Fig. 7. In the future, ML/DL-based load-balancing algorithms are expected to be a hotspot for researchers, as there is an emerging trend of hybridising ML-centric approaches with existing ones.

Fig. 7 Percentage-wise utilisation of various categories of load balancing algorithms from 2014 to 2024 based on SLR

Q2. What are the key factors influencing the performance of load-balancing mechanisms in cloud computing?

The performance of load balancing in the cloud is influenced by several aspects, including the availability of resources such as CPU, memory, storage, and network bandwidth, the nature of the workload, network latency, the load balancer algorithm, and the health of the server as well as fault detection and tolerance. The selection of the load balancing algorithm can significantly influence performance, as different algorithms vary in complexity and efficiency, affecting how resources are distributed. In cases of server overload or issues, the load balancer must be able to identify these problems and redirect traffic to other servers to maintain optimal performance.
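To illustrate how such factors can be combined in practice, the sketch below shows one simple, hypothetical way a load balancer might score candidate VMs on CPU, memory, and latency while skipping unhealthy nodes. The class, weights, and values are our own illustration under assumed semantics, not a mechanism taken from any reviewed study.

```java
import java.util.Comparator;
import java.util.List;

// Hypothetical snapshot of a VM/server state as seen by a load balancer.
record NodeState(String id, double cpuUtil, double memUtil,
                 double latencyMs, boolean healthy) {}

public class WeightedNodeSelector {

    // Illustrative weights; in practice these would be tuned or learned.
    private static final double W_CPU = 0.4, W_MEM = 0.3, W_LAT = 0.3;

    // Lower score means less loaded and therefore preferred for the next task.
    static double score(NodeState n) {
        double normalizedLatency = Math.min(n.latencyMs() / 100.0, 1.0);
        return W_CPU * n.cpuUtil() + W_MEM * n.memUtil() + W_LAT * normalizedLatency;
    }

    // Pick the healthy node with the lowest combined load score;
    // unhealthy nodes are skipped, mimicking fault detection and redirection.
    static NodeState select(List<NodeState> nodes) {
        return nodes.stream()
                .filter(NodeState::healthy)
                .min(Comparator.comparingDouble(WeightedNodeSelector::score))
                .orElseThrow(() -> new IllegalStateException("no healthy node available"));
    }

    public static void main(String[] args) {
        List<NodeState> nodes = List.of(
                new NodeState("vm-1", 0.80, 0.60, 12.0, true),
                new NodeState("vm-2", 0.35, 0.40, 20.0, true),
                new NodeState("vm-3", 0.10, 0.20, 5.0, false)); // unhealthy, ignored
        System.out.println("Next task goes to: " + select(nodes).id());
    }
}
```

The scoring function is exactly the point where the algorithm families discussed in this review differ: heuristic, meta-heuristic, and ML-centric policies all amount to different ways of computing or learning this decision.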

Q3. Which evaluation metrics are predominantly utilized for assessing the efficacy of load-balancing techniques in cloud computing environments?

The utilization trend of various metrics over the period 2014–2024 is shown graphically in Fig. 8. We have employed the frequency analysis method to determine the year-wise utilization of each performance metric. Table 8 provides an in-depth analysis of the performance metrics attained in every study. The year-wise categorization of each metric is shown in Table 9. The metrics most frequently used to gauge load balancing in cloud computing environments are Makespan, resource utilization (RU), Degree of Imbalance (DI), cost efficiency, throughput, and execution time. Evaluation metrics like fault tolerance, QoS, reliability and migration rate require additional attention without compromising other factors. The row named ‘other’ in Table 9 includes parameters like convergence speed, network longevity, fitness function, packet loss ratio, success rate, task scheduling efficiency, scalability, clustering phase duration, standard deviation of load, accuracy, precision and time complexity.
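For reference, the most frequently used of these metrics have widely adopted formulations. The expressions below reflect definitions commonly used in the load-balancing literature (individual studies may differ in detail), where CT_j is the completion time of VM j and m is the number of VMs:

```latex
\text{Makespan} = \max_{1 \le j \le m} CT_j, \qquad
DI = \frac{T_{\max} - T_{\min}}{T_{\mathrm{avg}}}, \qquad
RU_{\mathrm{avg}} = \frac{\sum_{j=1}^{m} CT_j}{m \cdot \text{Makespan}}
```

Here T_max, T_min, and T_avg denote the maximum, minimum, and average completion times across the VMs. A smaller DI indicates a more even load distribution, and an average resource utilization close to 1 indicates that the makespan is determined by genuinely busy VMs rather than idle ones.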

Fig. 8 The analysis of performance metrics used in load balancing based on SLR

Table 9 Comprehensive metrics evaluation of the reviewed literature

Q4. Which categories of algorithms have been used more in recent research trends in the cloud computing environment for solving load-balancing issues?

Figure 9 indicates that researchers prefer hybrid algorithms for addressing load balancing and task scheduling problems in cloud computing. This preference arises because hybrid algorithms combine the functionalities of several algorithms, resulting in precise, multi-objective solutions to task scheduling and load-balancing challenges. Around 2014, heuristic approaches were the common choice, but meta-heuristic approaches later replaced them. By 2022, the hybrid approach had become the dominant method. Interestingly, many of these hybrid techniques combine machine learning with other optimization methods.

Fig. 9 Year-wise utilisation trend of various techniques used in load balancing

Q5. Which simulation software tools have garnered prominence in recent scholarly analyses within the domain of cloud computing research?

Figure 10 shows that 51% of the reviewed studies use the CloudSim tool for simulation, followed by Python with 11%. We employed the frequency analysis method to quantify and compare the use of different simulation tools across the studies. CloudSim is thus the first choice of researchers and has seen increasing use over the last few years. It allows users to model and simulate cloud computing infrastructure, resource provisioning policies, and application scheduling algorithms. CloudSim is an external framework, available for download, that can be imported into development environments and build tools such as Eclipse, NetBeans IDE, and Maven. For example, Vergara et al. (2023) integrated the CloudSim toolkit with NetBeans IDE 8.2 on Windows 10 to simulate the cloud computing environment.
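For readers unfamiliar with the toolkit, the sketch below shows how a minimal CloudSim 3.x scenario is typically assembled: a datacenter with one host, a broker, one VM, and one cloudlet (task). The constructor signatures follow the CloudSim 3.x API, but all numeric parameters are illustrative placeholders; real experiments use much larger host, VM, and cloudlet lists and plug in custom allocation and scheduling policies.

```java
import java.util.*;
import org.cloudbus.cloudsim.*;
import org.cloudbus.cloudsim.core.CloudSim;
import org.cloudbus.cloudsim.provisioners.*;

public class MinimalCloudSimScenario {
    public static void main(String[] args) throws Exception {
        CloudSim.init(1, Calendar.getInstance(), false);   // 1 cloud user, no trace events

        // One host with a single 1000-MIPS processing element (illustrative values).
        List<Pe> peList = List.of(new Pe(0, new PeProvisionerSimple(1000)));
        List<Host> hostList = new ArrayList<>();
        hostList.add(new Host(0, new RamProvisionerSimple(2048),
                new BwProvisionerSimple(10_000), 1_000_000, peList,
                new VmSchedulerTimeShared(peList)));

        DatacenterCharacteristics ch = new DatacenterCharacteristics(
                "x86", "Linux", "Xen", hostList, 10.0, 3.0, 0.05, 0.001, 0.0);
        new Datacenter("Datacenter_0", ch, new VmAllocationPolicySimple(hostList),
                new LinkedList<Storage>(), 0);

        DatacenterBroker broker = new DatacenterBroker("Broker_0");

        // One VM and one cloudlet (task) submitted to the broker.
        Vm vm = new Vm(0, broker.getId(), 500, 1, 512, 1000, 10_000, "Xen",
                new CloudletSchedulerTimeShared());
        UtilizationModel full = new UtilizationModelFull();
        Cloudlet task = new Cloudlet(0, 40_000, 1, 300, 300, full, full, full);
        task.setUserId(broker.getId());

        broker.submitVmList(List.of(vm));
        broker.submitCloudletList(List.of(task));

        CloudSim.startSimulation();
        CloudSim.stopSimulation();

        // Report where each cloudlet ran and how long it took.
        List<Cloudlet> finished = broker.getCloudletReceivedList();
        for (Cloudlet c : finished) {
            System.out.printf("Cloudlet %d finished on VM %d in %.2f s%n",
                    c.getCloudletId(), c.getVmId(), c.getActualCPUTime());
        }
    }
}
```

Custom load-balancing or scheduling strategies are usually evaluated by replacing the broker's cloudlet-to-VM mapping or the VM allocation policy while keeping the rest of this scaffolding unchanged.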

Fig. 10 Analysis of simulation tools based on SLR

Q6. What insights do the future perspectives within the reviewed literature offer in terms of potential avenues for exploration and advancement within the field?

Based on this review, the future directions of the field focus on developing more advanced algorithms that harness the potential of machine learning and deep learning, enabling better energy efficiency and overall system performance in cloud computing environments. Real-time monitoring and automation of systems using AI approaches are also hot topics for future research. The future scopes recorded during the literature review are listed in Table 7.

All the responses in this study are deduced and documented from the above literature review. They are drawn impartially from the reviewed studies rather than from the authors' own opinions.

8 Statistical analysis

The SLR includes a bibliographic analysis to understand the development and current state of research in this area; examining how scholarly material is distributed can reveal both dominant patterns and gaps in the literature. We used the Scopus academic database to collect records matching the keywords “load balancing and task scheduling in cloud computing using machine learning”, which returned 129 items. The analysis centres on this dataset, illustrating the distribution of the published documents across the main subject areas and offering insight into the current priorities and interests of the academic community.

These publications are distributed across various subjects, providing insights into the interdisciplinary nature of this field, as shown in Fig. 11.

Fig. 11 Subject-wise analysis of publications from 2014 to 2024 related to used keywords

9 Discussion

Our extensive literature study has uncovered valuable insights and emerging trends crucial for advancing cloud computing technology. This discussion summarizes the research findings, answering the initial research questions and drawing conclusions from a thorough examination of the studies selected between 2014 and 2024.

9.1 Research gaps

Most research efforts concentrate on a single aspect of load balancing; many systems are limited to either data-center or network load balancing, and there is an urgent need for approaches that address multiple aspects together. The main gaps identified are as follows:

  1. Load balancing itself can become a single point of failure. Furthermore, most of the research concentrates on only a limited number of performance parameters, such as Makespan, throughput, and completion time; the Degree of Imbalance (DI) is a crucial parameter that deserves more attention.

  2. There is a significant need to enhance quality measures such as QoS (Quality of Service), fault tolerance, network delay, VM (Virtual Machine) migration and risk assessment.

  3. Fog and edge computing need to be integrated with the cloud to mitigate the requirement for massive amounts of data transfer. This will improve the flexibility and usefulness of cloud computing in multiple sectors.

  4. The power conservation mechanism has not been given much thought by researchers; there is a shortage of innovative thinking on power conservation in the context of load balancing.

  5. Geographical barriers introduce network delay and data transmission delay. Cutting-edge technologies are needed to overcome these distance- and delay-related issues (Muchori and Peter 2022).

  6. Virtual machine (VM) migration is also a challenge that strongly impacts the efficacy of cloud services. There is a dire need for techniques that reduce the number of VM migrations.

  7. Despite the advancements, applying machine learning algorithms in cloud computing remains complicated. The intricacy of these algorithms, combined with the requirement for extensive training data, presents substantial obstacles. The dynamic nature of cloud environments requires constant learning and adjustment of these models, which raises questions about their ability to handle large-scale operations and remain viable in the long term.

9.2 Integration of machine learning for enhanced load balancing and task scheduling

One key insight from this analysis is a growing reliance on machine learning methods to enhance load balancing and task scheduling processes. Although somewhat successful, conventional algorithms generally struggle in dynamic cloud systems where data and workload patterns continuously change. Due to their capacity to acquire knowledge and adjust accordingly, machine learning algorithms have demonstrated potential in forecasting workload patterns, enabling the implementation of more effective resource allocation strategies. This enhances efficiency and substantially decreases execution time and energy consumption, aligning with the objectives of achieving optimal resource utilisation and high system throughput (Janakiraman and Priya 2023; Edward Gerald et al. 2023).
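As a minimal illustration of this idea (our own sketch, not a method from any particular reviewed study), the code below uses simple exponential smoothing as a stand-in for the ML/DL workload-prediction models discussed above: the forecast of the next interval's CPU demand is compared with an overload threshold so that tasks can be redirected, or a migration triggered, before the overload actually occurs.

```java
// Minimal illustration of predictive load balancing: an exponentially
// smoothed forecast of per-VM CPU demand is compared against an overload
// threshold so that scheduling decisions can be made proactively.
public class SmoothedLoadForecaster {

    private final double alpha;      // smoothing factor in (0, 1]
    private Double forecast = null;  // current one-step-ahead forecast

    SmoothedLoadForecaster(double alpha) { this.alpha = alpha; }

    // Update the forecast with the latest observed utilization (0.0 to 1.0).
    double update(double observedUtil) {
        forecast = (forecast == null)
                ? observedUtil
                : alpha * observedUtil + (1 - alpha) * forecast;
        return forecast;
    }

    boolean likelyOverloaded(double threshold) {
        return forecast != null && forecast > threshold;
    }

    public static void main(String[] args) {
        SmoothedLoadForecaster vm1 = new SmoothedLoadForecaster(0.5);
        double[] observedCpu = {0.35, 0.42, 0.55, 0.70, 0.82}; // synthetic trace
        for (double u : observedCpu) {
            double f = vm1.update(u);
            System.out.printf("observed=%.2f forecast=%.2f overloaded=%b%n",
                    u, f, vm1.likelyOverloaded(0.75));
        }
    }
}
```

In the reviewed literature this forecasting step is typically performed by far richer models, such as neural networks or reinforcement learning agents, but the underlying control loop (predict, compare with a threshold, act proactively) is the same.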

9.3 Future directions

The future of cloud computing rests on advancing auto-adaptive systems capable of handling load balancing and task scheduling independently, without human involvement. Fusing artificial intelligence (AI) with cloud computing can create systems that provide unparalleled efficiency and reliability. The efficiency of cloud services could be improved significantly by developing lightweight machine learning models that require minimal training data and can adapt quickly to changing conditions. Moreover, investigating unsupervised learning algorithms could remove the requirement for large labelled datasets, making such approaches more practical. These are the most frequently observed future scopes based on this SLR:

  • Deployment of deep learning (DL) and machine learning (ML) techniques to predict load patterns: The predictive analysis of workload patterns can prevent resource underutilization or overloading. We can also use ML to reduce energy consumption and predict faults in cloud computing (Reddy et al. 2022; Mishra and Majhi 2023; Agarwal et al. 2020; Negi et al. 2021; Latchoumi and Parthiban 2022; Shuaib, et al. 2023).

  • Development of fault tolerance techniques integrated with load balancing: Only a small number of studies examine concerns such as fault tolerance and security alongside load balancing in cloud computing services, and they rarely elaborate on the connection between the two (Behera and Sobhanayak 2024; Tawfeeg et al. 2022; Brahmam and Vijay Anand 2024).

  • To extend the existing techniques for data security and privacy by incorporating blockchain technology with cloud computing (Edward Gerald et al. 2023; Saba et al. 2023; Li et al. 2020).

  • Achieving additional QoS metrics such as scalability, elasticity, and applicability across broader domains is another direction in which existing research can be extended (Adil et al. 2022; Talaat et al. 2022; Sultana et al. 2024).

  • Most of the researchers have focused on the energy consumption aspect. Future research should aim to achieve energy efficiency, as energy is set to become one of the scarcest resources (Rekha and Dakshayini 2019; Farrag et al. 2020; Panwar et al. 2019; Mahmoud et al. 2022; Asghari and Sohrabi 2021).

  • Cost-effectiveness and real-time load balancing are prominent research areas. Most researchers plan to extend their work to real-time analytics and dynamic cloud networks (Kumar and Sharma 2018; Ni et al. 2021).

  • Response delay is a crucial factor in real-time applications, and real-time analytics in complex and dynamic environments is a hotspot for researchers. Healthcare systems, telemedicine, and real-time monitoring or surveillance services are examples of delay-sensitive applications (Verma et al. 2024; Pradhan et al. 2022; Nabi et al. 2022; Shahakar et al. 2023).

  • Dynamic reallocation of dependent tasks is another scope for future research, and task priority-based scheduling optimizes cloud performance (Ran et al. 2019; Jena et al. 2022; Prabhakara et al. 2023).

  • Fog and edge computing architectures have limited resources, so optimal resource scheduling is essential. Many authors have discussed resource scheduling in fog and edge computing as a potential future area of study (Swarup et al. 2021; Kruekaew and Kimpan 2022).

This SLR records the future research scopes mentioned above, and Table 7 provides detailed information.

10 Conclusion

The study of the computational cloud is vast and comes with numerous challenges. Cloud computing allows end users to access computational resources on demand, which has led to widespread use of cloud services and made them an essential part of various businesses, notably online shopping sites. This increased usage has put more strain on cloud resources such as hardware, software, and network devices; consequently, load-balancing solutions are needed for efficient utilization of these resources.

This SLR categorizes the surveyed technologies into five classes: conventional/traditional, heuristic, meta-heuristic, ML-centric, and hybrid. Traditional approaches are slow, struggle to scale with problem size and complexity, and often become stuck in local optima. Heuristic algorithms, which demonstrate remarkable scalability, are suitable for large-scale optimization challenges in industries such as manufacturing, banking, and logistics, but they often produce approximate rather than optimal answers; meta-heuristic algorithms emerged to address these drawbacks. In recent years, hybrid strategies, which combine heuristic, conventional, and machine-learning approaches, have become increasingly popular. These strategies aim to exploit the advantages of several algorithms to overcome individual limitations and improve performance.

This systematic literature review of efficient load balancing and task scheduling in cloud computing environments has provided insights into the available algorithms, their limitations, evaluation metrics, challenges, simulation tools, and potential future directions. The analysis demonstrates that the current trend is the use of ML-centric and hybrid algorithms to address load balancing and job/task scheduling effectively, and the findings indicate a growing interest among researchers in incorporating ML/DL approaches. Our study also explained the fundamental structure of cloud computing and its operational principles, provided an impartial examination of evaluation metrics and simulation tools, and answered the research questions that formed the basis of this review with well-supported inferences drawn from the gathered material. This systematic review is intended as a foundational resource for future work in this domain and offers valuable information to researchers and practitioners involved in load balancing in cloud computing architectures. This SLR does not delve into security and privacy considerations related to load balancing; these are retained as topics for our future investigation. Table 10 provides the abbreviations used in this SLR.

Table 10 Abbreviations used in SLR