1 Introduction

Today, urban areas are home to more than half of the world’s population. It is projected that by 2050 [1], around seven out of every 10 people will call a city home. More than two-thirds of all human-caused greenhouse gas emissions and between sixty and eighty percent of all human-caused energy use occur in urban areas. Increased social divides, gridlock, water pollution, and related health problems are the results of our more urbanized world [2]. Countries can use Information and Communication Technologies (ICT) combined with renewable energy and other technologies such as the Internet of Things (IoT) [3] and Big Data [4] to build smarter, sustainable cities for their citizens.

Smart systems support users in decision-making by synthesizing data from multiple sensors to control processes and take appropriate actions [5]. Miniaturized computing, sensing, actuation, and communication capabilities define “smart” systems (optical, biological, and mechanical). By “smart waste management,” researchers mean any approach that uses information and communication technologies to optimize waste disposal in terms of time, money, and ecological impact. IoT-based smart resource management [6] involves collecting and analyzing data to optimize the utilization of a given resource, be it garbage, chickens, or crops. Smart waste management, for example, seeks to maximize resource utilization, reduce operational expenses, and increase the sustainability of waste services.

A smart and sustainable city is forward-thinking and uses information and communication technology (ICT) to raise living standards, streamline municipal processes, and boost economic vitality. At the same time, it ensures that the economic, social, environmental, and cultural requirements of both current and future generations are met. The smart city is a vision in which urban planning and sustainable development practices are developed and implemented to manage natural resources and meet the increasing challenges of urbanization. It relies on ICT, big data, and the Internet of Things (IoT) to improve efficiency and residents’ quality of life. In the IoT, things are represented and exist on the Internet, and connecting the real and digital worlds is a primary goal. Interactions between things and cloud-based applications [7, 8], such as environmental monitoring, object tracking, traffic management, healthcare, and remote monitoring, are made possible primarily through machine-to-machine communication. Through better asset utilization, more efficient processes, and higher productivity, businesses can use the Internet of Things to reduce costs substantially.

Multi-objective evolutionary algorithms (MOEAs) have been shown to be one of the best approaches for weighing conflicting objectives [9] once targets have been set. They also have several advantages over other optimization strategies. Because they deliver several viable solutions in one run, each with a different trade-off between the targets, network administrators can choose carefully from a larger set of solutions. In multi-objective optimization problems (MOPs), the main aim of evolutionary algorithms (EAs) is to find non-dominated, or Pareto-optimal (PO), solutions. Multi-objective optimization (MOO) algorithms such as EAs are used in many fields [10] to solve diverse MOPs. Evolutionary multi-objective optimization theory and methodologies are outlined in [11], which clarifies the essential ideas of multi-objective optimization and evolutionary algorithms. The authors of [12] thoroughly examine evolutionary multi-objective algorithms for a wide range of technological challenges.

Internet of Things (IoT) technology builds on the explosive growth of low-cost smart devices such as sensors, together with wireless networks, computing power, and connectivity, and it produces huge amounts of data. Together, the IoT and big data technologies offer promising prospects for smart city development [13]. IoT devices generate continuous stream data in a scalable manner, allowing actions and analytics to be performed on massive amounts of stream data. Such actions and analytics may involve event correlation, matrix computation, and statistical processing to build the smart city. Waste management [14] is an important part of a smart city’s infrastructure. The ability to aggregate bin locations, detect the waste status of each bin, and process the aggregated data is crucial for implementing any smart city waste management system. The results of this paper provide valuable input for an intelligent waste management system that reduces and imputes the missing data generated by waste bin sensors and calculates the optimal path to limit spoilage risks, waste pollution, and resource consumption. The proposed system will operate as an intelligent system in which users can take appropriate safety measures regarding waste management.

The following are the main contributions of this paper:

  • Reduces the energy that smart bins (SBs) consume when they send data to the disposal center, and reduces missing data.

  • Imputes the missing data emitted from SBs so that complete data are available, which leads to more accurate results.

  • Determines the optimal path so that waste collection trucks take the most efficient routes to the SBs.

  • Introduces an IoT-based framework for waste management systems in smart cities.

This paper is organized as follows. Related works are discussed in Sect. 2. Section 3 presents the problem formulation and constraints. The essential concepts of multi-objective (MO) optimization and the standard artificial hummingbird algorithm are provided in Sect. 4. The proposed system, IWMS, which handles missing data in the dataset and the routing of smart bin trucks, is described in Sect. 5. The experimental results and their discussion are presented in Sects. 6 and 7. Section 8 concludes the study and outlines future research directions.

2 Literature review and related work

This section examines several related works on IoT-based waste management for smart cities. In [15], the authors equip garbage bins with low-cost IoT sensors that gather waste features such as the size of the waste bin, the size of the waste, and the smell in the bin to alert truck drivers, waste management, and authorities. The embedded devices can be placed in various locations around the city. Shnu et al. in [16] include two units that track bins in public and household regions: the PBLMU (Public Bin Level Monitoring Unit) and the HBLMU (Home Bin Level Monitoring Unit). The PBLMUs and HBLMUs collect data about each trash bin’s unfilled level and location, process it, and send it to a central station for processing and storage.

Bano et al. in [17] developed IoT-based smart waste bin monitoring and municipal waste management capable of effectively collecting waste, detecting fires in waste, and forecasting future waste generation. In [18], the authors propose SCGCMS, a smart waste monitoring system based on IoT technologies for public garbage collection. The system is divided into two parts. Part one involves placing rubbish bins in various areas and filling them randomly, and part two consists of determining the path for collecting vans based on the bin-filling ratio.

Anh Khoa et al. in [19] rely on machine learning (ML) in an IoT paradigm, applying graph theory and ML to pick the best path for waste collection based on forecasting the likelihood of rubbish in trash bins. The proposed system is used for real-time monitoring and integrates multiple options: for better tracking and flexibility, energy is supplied to each bin in the network from various sources, including solar panels and batteries.

A smart waste management system (SWMS) based on IoT technologies is suggested in [20]. The authors use multiple technologies in public waste collectors to measure the status of rubbish containers in real time. The optimum path for the waste-collecting truck depends on the status of the waste bins, which ultimately decreases fuel costs. SWMS divides garbage bins into master and slave bins. Each rubbish bin has three sensors: a level sensor, a humidity sensor, and a load sensor. The master bin is in charge of sending data to the cloud database.

Gull et al. in [21] proposed an intelligent eNose food waste management system that uses a decision tree to identify food items, with an MQ4 sensor to detect CH4 gas and an MQ135 sensor to detect CO2 and NH3. In this eNose system, each type of food in the dataset is represented by 70 instances. Rules are set, and the system is then used to measure the volume of food waste and forecast food items. According to [22], waste collection trucks usually come once or twice every seven days, and waste ends up scattered on the streets because of poor garbage collection techniques. The researchers address this problem with a smart waste management solution based on machine learning (ML) and IoT, using image processing to calculate the waste value of a specific waste dump. Waste electrical and electronic equipment (WEEE) recovery is critical for environmental protection and resource conservation. For the safety and sustainability of the air cargo handling process, environmental justice must be incorporated as one of the strategic goals to be attained [23].

Erdinç et al. in [24] proposed a mixed-integer linear programming (MILP) model to determine the best route for waste collection trucks. The energy consumption of electric garbage trucks was compared across cases, and accounting for it made the results approximately 38% more realistic than the values obtained under various situations.

Most of the related literature deals with forecasting waste generation, energy consumption, and the path of waste trucks. However, it does not consider critical constraints such as the missing data transmitted from smart bins, the waste size in each smart bin, the energy of each smart bin, and the foul odor emitted from waste. This paper addresses these constraints in the proposed intelligent waste management system (IWMS): missing data are handled via imputation, the energy consumption of smart bins (SBs) is reduced to extend the life of the smart waste network, and the best routing path for waste trucks is selected to reduce fuel consumption and time. Table 1 summarizes the related works.

Table 1 Summary of the related work reviews

3 Problem description and constraints

3.1 Problem description

This section describes the problem of determining each vehicle’s optimal route so as to minimize total distance, bad odor emissions, and total costs, including vehicle costs and GHG emission costs. High-priority waste bins are located in specified areas (e.g., a hospital or a petrol station) and should be collected as soon as possible. The vehicles start at the disposal facility and travel to the designated garbage bins. When a garbage collection vehicle is fully loaded or its collecting task is accomplished, it must return to the disposal facility to unload the collected garbage. The problem formulation and its assumptions are given below.

The desired network can be represented by a graph G = (SB, E), where SB = {1, 2, …, n} denotes the set of smart bin nodes and E denotes the set of links. Each pair of nodes forms a network edge consisting of two arcs pointing in opposite directions. Each edge in E can be either a waste (required) or a non-waste edge. The following assumptions are considered in developing the smart waste management system.

  • G represents a region containing the waste bins for which one vehicle is responsible.

  • Smart waste bins are placed at specific locations, SB = {SB1, SB2, …, SBn}, where n denotes the number of bins.

  • ESB denotes the energy of each bin.

  • SBT denotes the temperature of the waste in each bin.

  • SBOdor denotes the harmful odor emission from each bin.

  • The vehicles are located in the main center for the disposal of waste.

  • There is only one source of waste disposal for each region represented by D.

3.2 Problem constraints

Many constraints affect routing between vehicles and bins, such as the amount of garbage in the bins, the bad odor emitted from bins that affects people, and the energy of the bins. These constraints affect the route taken by the truck to the bins. Each truck returns to the point where it began, visiting a separate set of smart waste containers in between. Each collecting location is visited by a truck. There is no straight route between collection places, so the roads cover a greater distance. The vehicle collects a waste container if it is full and moves on to the next container if it is not sufficiently filled. The collected material is disposed of at garbage recycling stations, and the status of the garbage bins is then updated. In this way, waste container occupancy rates can be monitored. The truck cannot stop on the street used for garbage pickup. Because the truck’s carrying capacity is limited, it may have to unload at the disposal facility and return to the waste containers to load the remaining garbage. Equations (1)–(4) represent the constraints that affect waste management system routing and are defined as follows:

$$\sum\limits_{i = 1}^{b} {V_{i} = 1}$$
(1)
$$\sum\limits_{i = 1}^{T} {SB_{i} }$$
(2)
$$\sum\limits_{i = 1}^{b} {(GZ)_{i} }$$
(3)
$$\sum\limits_{j = 1}^{k} {(E_{SB} )_{j} }$$
(4)
  • Constraint (1), Eq. (1): One truck V is responsible for each area; it moves from the disposal center to the smart bins and returns to the disposal center after emptying the waste bins. Here, b is the number of smart bins and V is the truck for each region.

  • Constraint (2), Eq. (2): Each smart garbage bin contains many types of garbage, and the bad odor they emit affects routing when trucks move from the disposal center to the smart bins. T is the number of bad odor types, and SBi is the reading for odor type i, which gives the volume of bad odor in each smart bin.

  • Constraint (3), Eq. (3): The garbage size of each smart bin is also a constraint. GZ is the size of the garbage in each smart bin, and b is the number of smart bins.

  • Constraint (4), Eq. (4): The energy of each smart bin, \(E_{SB}\), is an important constraint affecting the smart waste management system, and k is the number of smart bins.
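
The quantities in Eqs. (1)–(4) can be illustrated with a small data structure. The sketch below is only one possible reading of the constraints: the `SmartBin` record, its field names, the units, and the interpretation of Eq. (1) as a 0/1 truck assignment that sums to one are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SmartBin:
    """Hypothetical record for one smart bin (field names are assumptions)."""
    bin_id: str
    odor_levels: list = field(default_factory=list)  # one reading per odor type
    garbage_size: float = 0.0                        # GZ, e.g. litres
    energy: float = 0.0                              # E_SB, e.g. joules


def one_truck_per_region(assignment):
    """Eq. (1), as read here: the 0/1 truck indicators must sum to exactly 1."""
    return sum(assignment) == 1


def odor_volume(smart_bin):
    """Eq. (2): sum the odor readings of one bin over its T odor types."""
    return sum(smart_bin.odor_levels)


def total_garbage(bins):
    """Eq. (3): sum the garbage size GZ over all bins."""
    return sum(b.garbage_size for b in bins)


def total_energy(bins):
    """Eq. (4): sum the energy E_SB over all bins."""
    return sum(b.energy for b in bins)


bins = [
    SmartBin("SB1", odor_levels=[0.25, 0.5], garbage_size=40.0, energy=1.5),
    SmartBin("SB2", odor_levels=[0.5, 0.25], garbage_size=75.0, energy=1.0),
]
print(odor_volume(bins[0]), total_garbage(bins), total_energy(bins))
```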

4 Preliminaries

4.1 Multi-objective optimization problems (MOPs)

MOPs have various objective functions, as described in introduction. Because of the nature of these problems, the key challenge of multi-objective optimization is to find the right balance between the objectives and solutions, which can be recognized by the dominating relationship, also known as Pareto efficiency or optimality [25, 26]. Without sacrificing generality, the multi-objective optimization issue can be expressed as follows:

$${\text{Min}}\; f\left( {\vec{x}} \right) = \left[ {f_{1} \left( {\vec{x}} \right), \ldots ,f_{n} \left( {\vec{x}} \right)} \right], \quad \vec{x} = \left\{ {x_{1} , \ldots ,x_{k} } \right\} \in S$$
(5)
$${\text{Subject}}\,\, {\text{to}}: g_{j} \left( {\vec{x}} \right) \ge 0, \,\,\,\left( {j = 1,2, \ldots ,m} \right)$$
(6)
$$h_{j} \left( {\vec{x}} \right) = 0,\,\,\, \left( {j = 1,2, \ldots ,p} \right)$$
(7)
$$L_{j} \le x_{j} \le U_{j} ,\,\,\, j = 1,2, \ldots ,n$$
(8)

where S denotes the solution space. The objective functions are denoted by \(f_{1} \left( {\vec{x}} \right), \ldots ,f_{n} \left( {\vec{x}} \right)\), the inequality constraints by \(g_{j} \left( {\vec{x}} \right)\), and the equality constraints by \(h_{j} \left( {\vec{x}} \right)\). The number of objectives is n, the number of inequality constraints is m, the number of equality constraints is p, and [Lj, Uj] are the bounds of the jth variable.

Due to the nature of MOPs, arithmetic relational operators cannot be used to compare distinct solutions. Pareto dominance resolves this issue: one solution dominates another if it is no worse in every objective and strictly better in at least one. The reader may find more about Pareto dominance in [27]. In the literature, archives are widely used to solve MOPs with swarm intelligence algorithms. Convergence and coverage are the two main concerns when searching for the PO solutions of a particular problem. Convergence refers to how accurately a multi-objective algorithm approximates the true PO solutions. Coverage refers to how well the obtained PO solutions are distributed across the objectives.
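
The dominance relation used to compare solutions can be made concrete with a short sketch. It assumes that every objective is minimized, as in Eq. (5); the function names and the example objective pairs (distance, odor) are illustrative and not taken from the experiments.

```python
def dominates(a, b):
    """True if objective vector a Pareto-dominates b (all objectives minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def non_dominated(points):
    """Return the points that no other point dominates (the Pareto front)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (total distance, odor emission) values for five candidate routes.
candidates = [(10, 5), (8, 7), (12, 4), (11, 6), (9, 9)]
print(non_dominated(candidates))
```

Here (11, 6) is dominated by (10, 5) and (9, 9) by (8, 7), so three routes remain on the front; none of them can be ranked above the others without a preference between the two objectives.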

4.2 Artificial hummingbird algorithm (AHA)

The artificial hummingbird algorithm (AHA) is a recent bio-inspired algorithm proposed by Weiguo Zhao [28] to tackle global optimization problems [29]. Hummingbirds are beautiful creatures and the tiniest birds in the world. If intelligence were determined by the brain-to-body ratio, hummingbirds would be the most intelligent creatures on the planet. Hummingbirds can fly at up to 45 km/h. The largest species beat their wings about 12 times per second, while some of the smallest can exceed 80 beats per second [30]. Hummingbirds must forage to maintain their metabolism, and nectar makes up the majority of their diet. The three main components of AHA are as follows:

  • Food sources: The hummingbird evaluates the attributes of potential food sources, such as the nectar quality and content of individual flowers, the rate at which the flowers refill their nectar, and the frequency with which the hummingbird has previously visited the flowers.

  • Hummingbirds: Each hummingbird is always assigned to a specific food source and shares its position. A hummingbird can remember the location and nectar-refilling rate of this food source and share this knowledge with other hummingbirds in the population.

  • Visit table: The visit table records how often each hummingbird visits each food source and how long it has been since a given hummingbird last visited a given food source.

    AHA models three foraging behaviors of hummingbirds: guided foraging, territorial foraging, and migration foraging.

  • Initialization phase: A population of n hummingbirds is randomly placed on n food sources using Eq. (9).

    $$x_{i} = {\text{Low}} + r \times \left( {{\text{Up}} - {\text{Low}}} \right) , \,\,\,\,i = 1, \ldots , n$$
    (9)

    Low and Up are the lower and upper boundaries, r is a random vector in [0, 1], and xi represents the position of the ith food source. The visit table is initialized as in Eq. (10):

    $$VT_{{\left( {i,j} \right)}} = \left\{ {\begin{array}{*{20}l} {0,} \hfill & {{\text{if}}\,\,i \ne j} \hfill \\ {\text{Null,}} \hfill & {i = j} \hfill \\ \end{array} } \right.$$
    (10)
  • Guided foraging: Hummingbirds have a biological drive to maximize their nectar intake; the ideal nectar source has a rapid nectar refill rate and allows the hummingbird to go longer between feedings.

    In guided foraging, a hummingbird prioritizes the food sources that offer the most frequent nectar refills. Once a target has been identified, the hummingbird can fly directly to it. Foraging uses three distinct flight skills, axial, diagonal, and omnidirectional, whose direction vectors D are defined in Eqs. (11)–(13), respectively.

    $$D^{\left( i \right)} = \left\{ {\begin{array}{*{20}l} {1,} \hfill & {{\text{if}}\,\,i = {\text{randi}}\left( {\left[ {1,d} \right]} \right)} \hfill \\ {0,} \hfill & {{\text{else}}} \hfill \\ \end{array} } \right.$$
    (11)
    $$D^{\left( i \right)} = \left\{ {\begin{array}{*{20}l} {1,} \hfill & {{\text{if}}\,\,i = P\left( j \right),\,\,j \in \left[ {1,k} \right],\,\,P = {\text{randperm}}\left( k \right),\,\,k \in \left[ {2,\left\lceil {r_{1} \times \left( {d - 2} \right)} \right\rceil + 1} \right]} \hfill \\ {0,} \hfill & {{\text{else}}} \hfill \\ \end{array} } \right.$$
    (12)
    $$D^{\left( i \right)} = 1,\quad i = 1, \ldots ,d$$
    (13)

    where \(randi\left( {\left[ {1,d} \right]} \right)\) produces a random integer between 1 and d, randperm(k) produces a random permutation of the integers from 1 to k, and r1 is a random number in the range [0, 1].

  • Territorial foraging: When the flower nectar at the hummingbird’s preferred feeding spot has been depleted, the bird is less likely to return there in search of more food. As a result, a hummingbird can easily move to a neighboring region within its territory using Eqs. (14) and (15).

    $$v_{i} \left( {t + 1} \right) = x_{i} \left( t \right) + b \cdot D \cdot x_{i} \left( t \right)$$
    (14)
    $$b\sim N\left( {0,1} \right)$$
    (15)

    where b is a territorial factor and follows a normal distribution N(0,1) with mean of 0 and standard deviation of 1.

  • Migration foraging: When the food supply in a hummingbird’s usual feeding area dwindles, the bird will often travel to a new location to replenish its energy reserves. Equation (16) describes how the hummingbird at the nectar source with the slowest refill rate moves to a new source that is generated at random.

    $$x_{{{\text{wor}}}} \left( {t + 1} \right) = {\text{Low}} + r \times \left( {{\text{Up}} - {\text{Low}}} \right)$$
    (16)

    where xwor refers to the food source with the lowest nectar refill rate.
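
The AHA building blocks listed above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the function names are hypothetical, the bounds are scalars, fitness is minimized (so the worst food source has the largest value), and Eq. (12) is read as selecting k = ⌈r1 × (d − 2)⌉ + 1 random axes, clipped to the range [2, d].

```python
import math
import random


def init_population(n, d, low, up, rng):
    """Eq. (9): place n hummingbirds at random positions within [low, up]."""
    return [[low + rng.random() * (up - low) for _ in range(d)] for _ in range(n)]


def init_visit_table(n):
    """Eq. (10): 0 where i != j, None (null) on the diagonal."""
    return [[None if i == j else 0 for j in range(n)] for i in range(n)]


def axial_flight(d, rng):
    """Eq. (11): direction vector with one randomly chosen axis set to 1."""
    axis = rng.randrange(d)
    return [1 if i == axis else 0 for i in range(d)]


def diagonal_flight(d, rng):
    """Eq. (12), as read here: k random axes set to 1, with 2 <= k <= d."""
    k = math.ceil(rng.random() * (d - 2)) + 1
    k = min(d, max(2, k))
    chosen = set(rng.sample(range(d), k))
    return [1 if i in chosen else 0 for i in range(d)]


def omnidirectional_flight(d):
    """Eq. (13): every axis set to 1."""
    return [1] * d


def territorial_foraging(x, direction, rng):
    """Eqs. (14)-(15): v = x + b * D * x, with b drawn from N(0, 1)."""
    b = rng.gauss(0.0, 1.0)
    return [xi + b * di * xi for xi, di in zip(x, direction)]


def migrate_worst(population, fitness, low, up, rng):
    """Eq. (16): move the worst food source to a fresh random position."""
    worst = max(range(len(population)), key=lambda i: fitness[i])  # minimization
    d = len(population[worst])
    population[worst] = [low + rng.random() * (up - low) for _ in range(d)]
    return worst


rng = random.Random(42)  # fixed seed so the sketch is repeatable
pop = init_population(n=4, d=3, low=-5.0, up=5.0, rng=rng)
visits = init_visit_table(4)
candidate = territorial_foraging(pop[0], axial_flight(3, rng), rng)
```

The guided foraging update and the visit table bookkeeping are left out because their equations are not given in this section.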

5 The proposed intelligent waste management system (IWMS)

This section presents the proposed system for smart waste management, as illustrated in Fig. 1. It helps workers who deal with waste directly, protects citizens from waste-related harm, and improves the waste management network by reducing the energy consumed and the loss of sensor data, which it does by imputing the missing data. Finally, it chooses the best path for the waste vehicles that collect waste from smart bins, reducing travel time so that trucks reach the bins as soon as possible and the damage caused by full bins is avoided, subject to the constraints. The phases of the proposed system are outlined as follows.

Fig. 1 The proposed intelligent waste management system

The first phase of the proposed IWMS clusters smart bins based on the LEACH algorithm to reduce energy consumption across the smart bin network and to reduce the missing data generated by the sensors in smart bins. The second phase applies the missing data imputation approach based on AHA-KNN, which checks whether the data are complete and, if values are missing, imputes them. After the first and second phases, the third phase routes trucks to SBs based on the multi-objective artificial hummingbird algorithm (MOAHA) to determine the best route from the disposal center, where the trucks are stationed, to the smart bins that need service, according to the constraints that affect routing. MOAHA yields many non-dominated, or Pareto-optimal, front solutions that are better than the best results of seven other multi-objective algorithms, as explained in the experiments in Sect. 6.

5.1 Clustering smart bins based on AHA-LEACH algorithm

The first phase of the proposed IWMS improves the LEACH clustering method using the artificial hummingbird algorithm; the result is called AHA-LEACH. It aims to decrease energy consumption, which is one of the main advantages of the LEACH method, and to reduce the missing data received from smart bin sensors that communicate data to the disposal center (DC). Liao et al. [31] proposed a LEACH clustering algorithm, so before showing how the proposed method works, it is important to examine the limitations of LEACH as it is currently practiced.

LEACH uses two states in each round: set-up and steady state. In the set-up state, it generates clusters in a self-adaptive mode; in the steady state, it performs data transmission. The steady state is usually given more time than the set-up state to limit protocol overhead. Figure 2 shows the timeline of the LEACH protocol, including the set-up and steady-state phases. During the set-up phase, the cluster head (CH) nodes are selected at random from the nodes in the network, and several clusters are formed dynamically. In the steady-state phase, after the clusters are established, each CH distributes a transmission schedule to its cluster members, and each node transmits its sensed data to its CH node. When the CHs have collected all their members’ data, they send the aggregated data along with their own data to the base station (BS).

Fig. 2 LEACH structure

Each node n has a threshold T(n). In each round, the node chooses a random number between 0 and 1 and compares it with the threshold. If the chosen value is less than the threshold, the node becomes the cluster head for the current round. Each node serves as the cluster head for one iteration. The value of the threshold is given by Eq. (17).

$$T\left( n \right) = \left\{ {\begin{array}{*{20}l} {0,} \hfill & {n \notin G} \hfill \\ {\frac{P}{{1 - P\left( {r \bmod \frac{1}{P}} \right)}},} \hfill & {n \in G} \hfill \\ \end{array} } \right.$$
(17)

where P is the desired percentage of nodes to be chosen as cluster heads, r denotes the current round, and G is the set of nodes that have not been selected as a cluster head in the last 1/P rounds.
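
A short sketch shows how the threshold of Eq. (17) drives cluster head election. It assumes that 1/P is an integer number of rounds; the function names, the node identifiers, and the value P = 0.1 are illustrative choices rather than settings from the experiments.

```python
import random


def leach_threshold(p, r, in_g):
    """Eq. (17): threshold T(n) for round r and cluster-head fraction p.

    in_g is True when the node has not been a CH in the last 1/p rounds.
    """
    if not in_g:
        return 0.0
    period = round(1 / p)  # rounds per rotation cycle, assumes 1/p is an integer
    return p / (1 - p * (r % period))


def elect_cluster_heads(node_ids, eligible, p, r, rng):
    """A node becomes CH when its random draw in [0, 1) falls below T(n)."""
    return [n for n in node_ids
            if rng.random() < leach_threshold(p, r, n in eligible)]


rng = random.Random(7)
nodes = ["SB1", "SB2", "SB3", "SB4", "SB5"]
heads = elect_cluster_heads(nodes, eligible=set(nodes), p=0.1, r=0, rng=rng)
```

The threshold grows from P at the start of a cycle to about 1 in its last round, so every node that is still in G is elected by the end of the cycle. Nothing in the formula looks at energy or position, which is the limitation discussed next.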

LEACH’s biggest drawback is that its CH selection is random and takes into consideration neither the location of nodes relative to the CH and BS nor their residual energy. Because the CH is responsible for coordinating cluster activity and transmitting data to the BS, nodes located far from the BS and/or those with little residual energy may be selected as CHs, so this selection technique cannot ensure proper CH selection. Effective CH selection requires mechanisms that account for a node’s distance from the BS and its residual energy.

AHA-LEACH is proposed to address the aforementioned issues with LEACH and to reduce the missing data sent from smart bins to disposal centers. Transmission therefore goes from smart bins (nodes) to the CH, which sends aggregated data to the disposal center (DC). First, AHA-LEACH uses the artificial hummingbird algorithm to create optimized smart bin (node) clusters. AHA-LEACH outperforms LEACH due to its enhanced set of rules and fitness functions and its consideration of residual energy, among other factors. Then, AHA-LEACH uses special criteria for choosing the CH, based on the following:

  1. If the average energy of a cluster is higher than the CH’s remaining energy for that cluster.

  2. Swap out the existing CH with a node that has the lowest possible value for the fitness function.

  3. The standard clusters are generated and placed in a random manner, whereas the advanced clusters are positioned and generated with consideration given to the density distribution of typical clusters.
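
Criteria 1 and 2 can be read together as a replacement rule, which the sketch below illustrates. The fitness function is not specified in this section, so the one used here (distance to the disposal center divided by residual energy, lower is better) is purely a hypothetical stand-in, as are the function name and the example values.

```python
def reselect_cluster_head(cluster, ch_id, fitness):
    """Apply criteria 1 and 2 to one cluster and return the resulting CH id.

    cluster: dict mapping node id -> residual energy
    fitness: callable(node_id) -> float, lower is better (hypothetical)
    """
    avg_energy = sum(cluster.values()) / len(cluster)
    if avg_energy > cluster[ch_id]:        # criterion 1: CH is weaker than average
        return min(cluster, key=fitness)   # criterion 2: swap in the fittest node
    return ch_id


# Made-up residual energies (J) and distances to the disposal center (m).
energy = {"SB1": 0.2, "SB2": 0.8, "SB3": 0.6}
distance = {"SB1": 10.0, "SB2": 20.0, "SB3": 12.0}


def example_fitness(node_id):
    """Illustrative fitness: far-away, low-energy nodes score worse."""
    return distance[node_id] / energy[node_id]


print(reselect_cluster_head(energy, "SB1", example_fitness))  # CH below average
print(reselect_cluster_head(energy, "SB2", example_fitness))  # CH above average
```

With these numbers, SB1 is replaced by SB3 because its residual energy is below the cluster average, while SB2 keeps its role.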

The enhanced algorithm is intended to address the problems of the original LEACH and other LEACH variants [32, 33]. The main goal of the proposed AHA-LEACH is to choose the best cluster head in each area with less energy dissipation. Figure 3 depicts the proposed algorithm’s flowchart, in which Cluster (AVGE) stands for a cluster’s average energy and CH (ResE) for the CH’s residual energy. The main idea of the proposed AHA-LEACH algorithm is explained step by step in Algorithm 1. The experimental results in Sect. 6 include graphs that compare the performance of AHA-LEACH with that of the existing LEACH algorithm.

Fig. 3 Flowchart of the proposed AHA-LEACH algorithm

Algorithm 1 The proposed AHA-LEACH algorithm

5.2 Smart bins missing data imputation based on optimized K-NN

The previous section discussed the optimized LEACH clustering, which reduces the missing data emitted from smart bins and decreases energy consumption. However, some data are still missing after this reduction, so the remaining gaps are filled by K-NN optimized using AHA, which improves the speed, flexibility, and effectiveness of KNN. The aim is to estimate missing values in data collected from sensors [34]. How much missing data matters depends on the missing rate. When the dataset’s missing rate is less than 1%, the effect on experimental results can be neglected. A missing rate of 1% to 5% has a minor impact on the experimental outcomes that can be mitigated. If the missing rate is greater than 5%, however, the experiment’s results will be affected. As a result, missing data must be predicted with effective values to achieve reliable findings. Accuracy [35] and the root-mean-square error (RMSE) [36], which measures the differences between the predicted or estimated values and the observed values (sample or population values), were chosen as the criteria for evaluating model performance.
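
The two quantities used in this paragraph, the missing rate and the RMSE, can be computed as in the sketch below. Representing a missing sensor value as `None`, the function names, and the example readings are assumptions made for illustration.

```python
import math


def missing_rate(rows):
    """Fraction of cells that are missing (None) in a list of rows."""
    cells = [v for row in rows for v in row]
    return sum(v is None for v in cells) / len(cells)


def rmse(predicted, observed):
    """Root-mean-square error between imputed and observed values."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


# Made-up sensor rows: (fill level, odor level, temperature).
readings = [
    [0.4, None, 21.0],
    [0.7, 0.2, 22.5],
    [None, 0.3, 20.0],
    [0.9, 0.1, 23.0],
]
print(missing_rate(readings))        # 2 of 12 cells are missing
print(rmse([3.0, 3.0], [0.0, 0.0]))  # 3.0
```

A rate of 2/12 is far above the 5% level mentioned above, so a dataset like this toy example would need imputation before any analysis.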

The solution storage (ST) holds the input and output of the algorithm. The first step reads data from the dataset and saves it in ST. Next, min–max normalization, one of the most common methods for normalizing data, is applied, and the data are separated into complete and incomplete data and saved in ST. For example, if a feature’s minimum value was 20 and its maximum value was 40, the value 30 would be changed to 0.5 because it is midway between the two. The normalized value is calculated via Eq. (18):

$${\text{value}}_{{{\text{norm}}}} = \frac{{{\text{value}} - \min }}{\max - \min }$$
(18)
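
The normalization of Eq. (18) can be checked with the worked example from the text (minimum 20, maximum 40, value 30). In this sketch, keeping missing readings as `None` and mapping a constant feature to 0.0 are assumptions, and the function name is illustrative.

```python
def min_max_normalize(values):
    """Scale values to [0, 1] with Eq. (18); None marks a missing reading."""
    present = [v for v in values if v is not None]
    lo, hi = min(present), max(present)
    if hi == lo:
        # Assumption: a constant feature carries no information, map it to 0.0.
        return [None if v is None else 0.0 for v in values]
    return [None if v is None else (v - lo) / (hi - lo) for v in values]


print(min_max_normalize([20, 30, 40]))        # [0.0, 0.5, 1.0]
print(min_max_normalize([20, None, 30, 40]))  # missing entries stay None
```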

The approach uses the standard KNN structure for parameter initialization. It begins by creating a population of individuals, each of which generates an imputed dataset. The population is assessed on accuracy and RMSE, abbreviated as ObjAcc and ObjRMSE. Using the data from the previous phase, the initial population is then subjected to KNN processes, which rank the population based on fitness function dominance and steer the selection criteria toward evenly distributed optimal solutions.

Following that, the initial population goes through the evolutionary process until the stopping condition of M maximum iterations is reached. Each pair of selected individuals undergoes k-neighbor operations to construct an imputed dataset, with AHA improving the selection of K in each iteration before performance is evaluated using RMSE and accuracy. The instance determination phase is primarily concerned with computing the distance between each incomplete instance and the complete instances; the most similar complete instances are used in the imputation process. The distance between the incomplete data and the complete data combines two distance metrics and is computed by Eq. (19). The final step returns the dataset after it has been imputed and saved in the solution storage ST.

$$d\left( {x_{k} , x_{l} } \right) = \sqrt {\sum\nolimits_{i = 1}^{D} {d_{i } \left( {x_{ki} - x_{li} } \right)^{2} } }$$
(19)
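The distance in Eq. (19) and its use for imputation can be sketched as follows. This is an illustrative reading only: the coefficient \(d_{i}\) is assumed to be a per-feature weight, features missing in either record are assumed to be skipped, and each gap is assumed to be filled with the mean of the k nearest complete records. The function names are hypothetical.

```python
import math


def weighted_distance(x_k, x_l, weights):
    """Weighted Euclidean distance of Eq. (19) over features present in both records."""
    total = 0.0
    for a, b, w in zip(x_k, x_l, weights):
        if a is None or b is None:
            continue  # assumption: ignore features missing in either record
        total += w * (a - b) ** 2
    return math.sqrt(total)


def knn_impute_row(incomplete, complete_rows, weights, k):
    """Fill None entries with the mean of the k nearest complete records."""
    nearest = sorted(
        complete_rows,
        key=lambda row: weighted_distance(incomplete, row, weights),
    )[:k]
    return [
        sum(row[i] for row in nearest) / len(nearest) if value is None else value
        for i, value in enumerate(incomplete)
    ]


complete = [[1.0, 2.0, 3.0], [2.0, 4.0, 3.0], [9.0, 8.0, 7.0]]
print(knn_impute_row([1.5, None, 3.0], complete, weights=[1.0, 1.0, 1.0], k=2))
# [1.5, 3.0, 3.0]
```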

K-nearest neighbor (KNN) is a popular method for filling gaps in data. However, it has a serious flaw: K is not always optimally initialized. Consequently, this paper investigates AHA with the aim of finding the best possible value of K in KNN. Algorithm 2 and the flowchart in Fig. 4 detail the proposed AHA-KNN algorithm.
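To illustrate what selecting K by imputation error means, the sketch below hides each known value in turn, re-estimates it with KNN, and keeps the K with the smallest RMSE. It deliberately replaces the AHA optimizer with a plain exhaustive search over a few candidate values, so it shows only the objective being optimized, not the proposed AHA-KNN itself. All function names and the toy data are assumptions.

```python
import math


def knn_estimate(target, rows, col, k):
    """Estimate column `col` of `target` as the mean over its k nearest rows."""
    def dist(row):
        # Euclidean distance over every column except the one being estimated.
        return math.sqrt(sum((a - b) ** 2
                             for i, (a, b) in enumerate(zip(target, row))
                             if i != col))
    nearest = sorted(rows, key=dist)[:k]
    return sum(row[col] for row in nearest) / len(nearest)


def rmse_for_k(rows, col, k):
    """Hide each known value in `col` in turn, re-estimate it, and return the RMSE."""
    squared = 0.0
    for i, row in enumerate(rows):
        others = rows[:i] + rows[i + 1:]
        squared += (row[col] - knn_estimate(row, others, col, k)) ** 2
    return math.sqrt(squared / len(rows))


def best_k(rows, col, candidates):
    """Exhaustive search over candidate K values (a stand-in for the AHA optimizer)."""
    return min(candidates, key=lambda k: rmse_for_k(rows, col, k))


# Toy data in which the second column is twice the first.
rows = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0], [5.0, 10.0], [6.0, 12.0]]
print(best_k(rows, col=1, candidates=[1, 2, 3]))  # 2
```

A metaheuristic such as AHA becomes worthwhile when K is searched jointly with other parameters, for example attribute weights, where exhaustive search is no longer practical.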

Fig. 4 Imputation steps of dataset based on AHA-KNN algorithm

5.3 Smart bin trucks routing based on MOAHA

The majority of traditional vehicle routing research focuses on a single goal, but in the last decade there has been a growing trend toward pursuing several goals simultaneously. Multiple objectives are considered either to extend or generalize a classical problem or to handle real-world problems. The proposed multi-objective artificial hummingbird algorithm (MOAHA) is a recent approach to multi-objective optimization. The convergence ability and efficiency of Pareto-based multi-objective metaheuristics can suffer as the number of objectives grows [37]. In this paper, MOAHA is used to solve the multi-objective problem, with the fitness function defined in Sect. 5.3.1.

AHA is transformed into a multi-objective optimization algorithm by adding two components. The first is an external storage (ES) that holds the non-dominated Pareto-optimal (PO) solutions found so far; the second is a leader selection approach that chooses solutions from ES. The leader’s main responsibility is to guide the other search agents toward promising regions of the search space so that they find a solution close to the global optimum. However, because of Pareto optimality, solutions in a multi-objective search space cannot simply be compared directly. This problem is addressed by the leader selection procedure, which finds the least crowded regions of the search space and selects a solution from one of them as the leader. Hypercubes are selected using a roulette wheel [38], with the associated probabilities given in Eq. (20):

$$P_{i} = \frac{c}{{N_{i} }}$$
(20)

where \(N_{i}\) denotes the number of PO solutions obtained in the i-th segment (hypercube) and \(c > 1\) is a constant.
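A roulette-wheel draw with the weights of Eq. (20) can be sketched as follows. Normalizing the weights so that they sum to one, the default value of c, and the function names are assumptions made for illustration; only occupied hypercubes (with at least one solution) are considered.

```python
import random


def selection_probabilities(counts, c=2.0):
    """Weights c / N_i from Eq. (20), normalised to sum to 1 (an assumption)."""
    raw = [c / n for n in counts]
    total = sum(raw)
    return [r / total for r in raw]


def pick_hypercube(counts, rng, c=2.0):
    """Roulette-wheel draw: sparsely populated hypercubes are chosen more often."""
    probs = selection_probabilities(counts, c)
    return rng.choices(range(len(counts)), weights=probs, k=1)[0]


counts = [1, 2, 4]           # number of archived solutions per occupied hypercube
rng = random.Random(0)       # seeded only to make the sketch repeatable
print(selection_probabilities(counts))
print(pick_hypercube(counts, rng))
```

Because the weight is inversely proportional to the number of solutions, the hypercube holding a single solution is four times as likely to supply the leader as the one holding four.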

ES also functions as a storage mechanism for non-dominated PO solutions, allowing them to be saved and retrieved. An important element of the ES is its controller, which regulates when solutions are added and what happens when the repository is full. At each iteration, the newly found non-dominated solutions are compared with the repository residents, and the ES is updated using the Pareto-dominance concept. To decide whether a new solution is kept in the archive, it must be compared against every non-dominated solution already in the ES; because the repository can hold a large number of individuals, the computation time depends on the number of these comparisons.
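The archive update described above can be sketched as follows, assuming that all objectives are minimized. The pruning rule applied when the archive is full (dropping the member whose nearest neighbor is closest) is an assumption chosen for brevity; the controller of the proposed system may use a different rule.

```python
import math


def dominates(a, b):
    """True if `a` is no worse than `b` in every objective and better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def update_archive(archive, candidate, capacity):
    """Return a new archive after offering `candidate` (objectives are minimised)."""
    if any(dominates(member, candidate) or member == candidate for member in archive):
        return list(archive)  # rejected: dominated by, or identical to, a resident
    kept = [member for member in archive if not dominates(candidate, member)]
    kept.append(candidate)
    if len(kept) > capacity:
        # Assumed pruning rule: drop the member whose nearest neighbour is closest.
        def nearest_gap(i):
            return min(math.dist(kept[i], kept[j]) for j in range(len(kept)) if j != i)
        kept.pop(min(range(len(kept)), key=nearest_gap))
    return kept


archive = []
for point in [(1.0, 5.0), (5.0, 1.0), (3.0, 3.0), (4.0, 4.0), (2.0, 2.0)]:
    archive = update_archive(archive, point, capacity=10)
print(archive)  # [(1.0, 5.0), (5.0, 1.0), (2.0, 2.0)]
```

Each offered solution is compared with every resident, which is why the cost of an update grows with the archive size.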

Figure 5 presents the proposed multi-objective version of the artificial hummingbird algorithm (MOAHA). MOAHA employs the same search pattern as AHA and applies it to the multi-objective problems of smart bins. In a multi-objective optimization problem, no single solution optimizes all of the objectives simultaneously. Therefore, the proposed MOAHA algorithm aims to determine the PO set and to choose trade-offs among the objectives.

Fig. 5 Flowchart of the proposed truck routing

According to Fig. 5, each iteration extracts the best routes from the truck to the bins and saves them in the visit table VT(i, j). The table is generated after evaluating a fitness function that captures the desired goals: minimizing the distance between the truck and the smart bins and reducing the energy consumption of the smart bin network. The visit table (VT) is stored in the external storage (ES), which holds the non-dominated solutions. In each iteration, the network energy and distances are calculated, and the best solution, that is, a route with minimum distance and minimum energy consumption, is determined and stored in ES. MOAHA then selects the best of the generated routes, which is cost-effective and consistent with the objective function (the fitness function). Section 6 presents the results of running the program in MATLAB, as well as a comparison of various approaches.

5.3.1 Fitness function

The proposed intelligent waste management system involves two objectives, which makes it a multi-objective problem:

  • Minimize the energy consumption of the smart bins, ESB, according to the energy of each bin.

  • Minimize the distance between the smart bins and the garbage trucks, DSB.

Mathematically, the two objective functions are defined based on Eqs. (21) and (22).

$${\text{Minimize}} \,\,O_{1} = \mathop \sum \limits_{n \in N} f_{1} \left( x \right) = \frac{n}{{{\text{ESB}}}}$$
(21)
$${\text{Minimize}}\,\,O_{2} = f_{2} (x) = \frac{1}{{\sum\nolimits_{i = 1}^{{nSB_{{{\text{nei}}}} }} {{\text{DSB}} \times {\text{ESB}}_{{{\text{nei}}}} } }}$$
(22)

In Eqs. (21) and (22), N denotes the total number of smart bins, ESB represents the energy of each smart bin, and ESBnei is the energy of each smart bin neighboring the cluster head (CH), according to the LEACH clustering executed in the first phase of the system. DSB describes the distance between the smart bins and the trucks at the disposal center. Equations (14) are constraints on the route of the waste truck.

The first fitness function, f1(x), represents the mean energy of the neighboring waste bins, and the second fitness function, f2(x), favors a smaller sum of energy per cluster of nodes surrounding the chosen CH and a shorter overall distance. Both fitness functions depend on each node’s position vector x. Equation (23) expresses the routing optimization as a single vector-valued objective.

$${\text{min}}_{f\left( x \right)} = \left( {f_{1} \left( x \right),f_{2} \left( x \right)} \right)$$
(23)

The goal of this paper is to find an optimal route that minimizes the total distance traveled by waste trucks of varying sizes. The system then optimizes a specific route for each waste truck based on the data it receives. It also aims to balance the workload among the waste collection trucks according to their working time.

6 Experiments and results

This section presents and discusses the experimental results for the three phases of the proposed Intelligent Waste Management System (IWMS). All experiments were run on the same PC, whose settings are detailed in Table 2. All experiments were carried out in MATLAB, which was also used to extract and analyze the results.

Table 2 Specification of the experimental settings

6.1 Results of the clustering-based optimization

The proposed AHA-LEACH was executed for the network of smart bins shown in Fig. 6. Table 3 and Fig. 7 show the energy status of each smart bin before and after using it, which demonstrates the reduction in energy consumption for each smart bin. A long distance between a smart garbage bin and the main disposal center directly increases energy consumption and the amount of missing data. We therefore used clustering to determine the nearest communication station that receives data from each smart garbage bin, instead of sending the data directly to the main disposal center. The clustering methods compared are fuzzy c-means (FCM) [39], fuzzy c-means differential evolution (FCDE) [40], LEACH [31], and the proposed AHA-LEACH. To calculate energy consumption (EC), multiply the wattage of the unit by the number of hours it is used to obtain the watt-hours spent each day. Smart bins run 24 h a day, and most smart bins consume 300–800 W of electricity. Assuming a smart bin draws 350 W, EC = 350 W × 24 h = 8400 Wh per day.
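The daily energy figure above is a single multiplication, which the following sketch reproduces for the 350 W example and for the two ends of the stated 300–800 W range. The function name is hypothetical.

```python
def daily_energy_wh(power_watts, hours_per_day=24):
    """Energy consumption in watt-hours per day: power multiplied by hours of use."""
    return power_watts * hours_per_day


print(daily_energy_wh(350))                        # 8400
print(daily_energy_wh(300), daily_energy_wh(800))  # 7200 19200
```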

Fig. 6 AHA-LEACH clustering for smart bins

Table 3 Energy consumption for a sample of smart bins after using LEACH

Fig. 7 Energy consumption chart for each smart bin

Table 4 and Fig. 8 present, in tabular and graphical form, respectively, the initial data clustering used in the proposed AHA-LEACH algorithm and the energy dissipation in each iteration for fuzzy c-means (FCM), fuzzy c-means differential evolution (FCDE), LEACH, and the proposed AHA-LEACH. AHA-LEACH maintains a stable number of clusters and a balanced cluster distribution, so its energy dissipation is stable and small compared with FCM, FCDE, and the classic LEACH. The results show that energy is depleted in FCM, LEACH, and FCDE after 50, 100, and 130 iterations, respectively, whereas in AHA-LEACH it is depleted after 200 iterations. Hence, AHA-LEACH improves on classical LEACH: the node lifetime is extended, which in turn extends the lifetime of the smart bin network. The energy dissipated during the LEACH and AHA-LEACH iterations is tabulated in Table 4. Compared with FCM, LEACH, and FCDE, AHA-LEACH dissipates the least energy. FCM is the worst case, and FCDE gives the intermediate result between LEACH and AHA-LEACH.

Table 4 Dissipation of energy and number of data packets after each iteration for LEACH and AHA-LEACH

Fig. 8 Energy dissipation and number of data packets in each iteration for LEACH and AHA-LEACH

6.2 Results for imputation of missing data

Table 5 shows a sample of the data with missing values before imputation. Tables 6 and 7 show the results of mean and KNN imputation, respectively; these algorithms filled in some of the missing data, but not all of them. Because some values were still missing, we improved the imputation and filled in all the missing data using the proposed AHA-KNN, as shown in Table 8.

Table 5 Sample data in a dataset with missing values
Table 6 Imputing missing data using mean
Table 7 Imputing missing data using KNN
Table 8 Imputing missing data using AHA-KNN

We can conclude from Figs. 9, 10, and 11 that AHA-KNN imputation outperforms mean and KNN imputation of missing values in the dataset. AHA-KNN determines the best value of k and then applies weights to each attribute in the dataset. The figures also show the mean errors of the alternative imputation methods; AHA-KNN has the lowest mean error of the three methods used in this work. Furthermore, the weights show that the most relevant predictive variables for determining class labels have larger weight values. Figure 12 uses two colors: blue marks the original data, and red marks the missing data reconstructed by the proposed AHA-KNN.

Fig. 9 Mean errors of mean imputation, KNN imputation, and AHA-KNN imputation at different missing value rates for the attribute current fill level in the dataset

Fig. 10 Mean errors of mean imputation, KNN imputation, and AHA-KNN imputation at different missing value rates for the attribute battery health in the dataset

Fig. 11 Mean errors of mean imputation, KNN imputation, and AHA-KNN imputation at different missing value rates for the attribute odor threshold value in the dataset

Fig. 12 Visualization of the reconstructed (imputed) missing data

6.3 Performance measures

  1. RMSE: the root-mean-square error is a frequently used measure of the precision of an imputation and is defined by Eq. (24):

    $${\text{RMSE}} = \sqrt {{1/m}~\sum\nolimits_{{i = 1}}^{m} {(e_{i} - ~\tilde{e}_{i} )^{2} }}$$
    (24)

    where \(e_{i}\) is the true value, \(\tilde{e}_{i}\) is the imputed value of the missing data, and m is the number of missing values.

  2. Accuracy: Estimating missing values enables a dataset with some gaps to be treated as if it were complete. To show how imputation affects accuracy and to compare different imputation methods, we use the accuracy measure in Eqs. (25) and (26):

    $${\text{Accuracy}} = \frac{1}{n} \mathop \sum \limits_{i = 1}^{n} l(lC_{i} , \,\,RC_{i} )$$
    (25)
    $$l\left( {lC_{i} ,RC_{i} } \right) = \left\{ {\begin{array}{*{20}l} {1,} & { lC_{i} = RC_{i} } \\ {0,} & {{\text{Otherwise}}} \\ \end{array} } \right.$$
    (26)

    where n is the size of the entire dataset, and \(lC_{i}\) and \(RC_{i}\) denote the predicted class label and the real class label of the i-th instance, respectively.
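The two measures in Eqs. (24) to (26) can be computed as in the following sketch. The function and argument names are chosen here for illustration and do not come from the authors' MATLAB code.

```python
import math


def rmse(true_values, imputed_values):
    """Eq. (24): root-mean-square error over the m imputed entries."""
    m = len(true_values)
    return math.sqrt(sum((e - t) ** 2 for e, t in zip(true_values, imputed_values)) / m)


def accuracy(predicted_labels, real_labels):
    """Eqs. (25) and (26): fraction of instances whose predicted label matches the real one."""
    n = len(real_labels)
    return sum(1 for p, r in zip(predicted_labels, real_labels) if p == r) / n


print(rmse([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 8.0]))     # 2.0
print(accuracy(["a", "b", "a", "b"], ["a", "b", "b", "b"]))  # 0.75
```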

Figure 13 shows the RMSE for each selected k value and highlights the k values with the smallest RMSE. Table 9 reports the RMSE for the dataset under the three missingness mechanisms, Missing Completely at Random (MCAR), Missing at Random (MAR), and Missing Not at Random (NMAR), with varying missing rates (10%, 20%, and 30%). In terms of RMSE, AHA-KNN outperforms the alternatives regardless of the missingness mechanism, and mean imputation produced the worst results on this dataset. Table 9 also demonstrates that a higher missing rate results in a larger RMSE. Under MAR, for instance, the difference between the ten and twenty percent missing rates ranges from 0.07 to 0.0892. Therefore, the missing data should be replaced with the mean. Although mean substitution will most likely introduce some bias, it prevents even larger deviations due to information loss, which is especially problematic when the missing rate is high.
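For readers who wish to reproduce this kind of comparison, the simplest of the three mechanisms, MCAR, can be simulated by deleting each cell independently with a fixed probability, as sketched below. How missingness was actually generated for Table 9 is not described here, so the function, the toy data, and the seed are assumptions.

```python
import random


def mask_mcar(rows, rate, rng):
    """Copy `rows`, replacing each cell by None independently with probability `rate`."""
    return [[None if rng.random() < rate else value for value in row] for row in rows]


rng = random.Random(42)  # seeded only to make the sketch repeatable
data = [[float(i), float(2 * i)] for i in range(10)]
for rate in (0.10, 0.20, 0.30):
    masked = mask_mcar(data, rate, rng)
    missing = sum(value is None for row in masked for value in row)
    print(rate, missing)
```

MAR and NMAR require the deletion probability to depend on observed or unobserved values, respectively, and are not covered by this sketch.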

Fig. 13 RMSE of different K values for the proposed AHA-KNN

Table 9 RMSE across three missingness mechanisms with different missing rates

The dataset is also used to validate the predictions made by the imputation methods. Figures 14, 15, and 16 show the results of iterative imputation for the estimation methods at the considered levels of missing data under the three missingness mechanisms, including the convergence of AHA-KNN and KNN at the various missing rates. AHA-KNN outperforms the other methods, and KNN is the second-best method for handling missing data. In summary, AHA-KNN has better imputation performance than KNN and mean imputation.

Fig. 14 MCAR experimental results for the dataset with varying missing rates

Fig. 15 MAR experimental results for the dataset with varying missing rates

Fig. 16 NMAR experimental results for the dataset with varying missing rates

Table 10 reports the accuracy of the various algorithms on the dataset under the three missingness mechanisms with a 10% missing rate. For instance, under MCAR, accuracy rises from 80.25% with mean imputation (before optimization) to 84.47% and 94.21% after imputation by KNN and the proposed AHA-KNN, respectively. In terms of accuracy, the proposed method is the most effective, and in many cases the AHA-KNN algorithm increases the accuracy of imputation.

Table 10 Accuracy results for several algorithms on the dataset based on various missingness mechanisms with a 10% missing rate

6.4 Results of the smart bins truck routing

This section reports the results of the proposed MOAHA routing algorithm. MOAHA is compared with MOSA [41], MOWOA [42], MSSA [43], MOGWO [44], MOGOA [45], MOPSO [46], and MDEA [47], which stand for, respectively, Multi-Objective Simulated Annealing, Multi-Objective Whale Optimization Algorithm, Multi-Objective Salp Swarm Algorithm, Multi-Objective Grey Wolf Optimization, Multi-Objective Grasshopper Optimization Algorithm, Multi-Objective Particle Swarm Optimization, and Multi-Objective Differential Evolution Algorithm. All experiments were executed in MATLAB. The parameter settings of the proposed algorithm and of each compared algorithm are shown in Table 11.

Table 11 Experimental parameter settings for each multi-objective algorithm

The goal of this experiment is to validate the proposed MOAHA algorithm and compare it with seven other algorithms: MOSA, MOWOA, MSSA, MOGWO, MOGOA, MOPSO, and MDEA. Table 12 and Fig. 17 show the running times for 100, 200, 300, 400, and 500 iterations. At 100, 200, and 300 iterations, the running times of MOAHA and MSSA are close, although MOAHA is still faster than all other algorithms; at 400 and 500 iterations, MOAHA also has a better running time than the rest of the algorithms.

Table 12 Running time (in seconds) of the different algorithms according to the number of iterations

Fig. 17 Running time of the proposed MOAHA compared with the other algorithms

Table 13 shows the average fitness function obtained over the iteration runs. The proposed MOAHA achieves the best performance, demonstrating its ability to select the best route from the trucks to the smart bins effectively. Similar results can be found in Tables 14 and 15, which show the best and worst fitness functions obtained over the iteration runs, respectively.

Table 13 Mean fitness function for the different algorithms
Table 14 Best fitness function for the different algorithms
Table 15 Worst fitness function obtained from the different algorithms

Figure 18 gives a graphical representation of the mean, best, and worst fitness functions, and Fig. 19 shows the convergence of the fitness function for the compared algorithms. MOAHA achieves the best convergence toward the majority of the true Pareto-optimal fronts, and its results are superior to those of MOSA, MOWOA, MSSA, MOGWO, MOGOA, MOPSO, and MDEA.

Fig. 18 Different measures for fitness function

Fig. 19 Convergence of the fitness function

To verify the effectiveness of the proposed MOAHA routing, we compare MOAHA with the multi-objective algorithms MOSA, MOWOA, MSSA, MOGWO, MOGOA, MOPSO, and MDEA on the same dataset. The comparison considers the best solution, the worst solution, and the number of generations, together with four measures: inverted generational distance (IGD), generational distance (GD), the metric of spread, and the metric of spacing. Table 16 and Fig. 20 show that MOAHA is better than the other algorithms on these four measures. The Pareto front (PF) is constructed from the solutions generated by MOAHA with the fitness function over the iterations.
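Two of the four measures, GD and IGD, can be sketched as below. This uses the mean-distance variant of both indicators and assumes that a reference Pareto front is available; the exact variant and reference front used for Table 16 are not specified here, and the sample points are purely illustrative.

```python
import math


def _mean_nearest(from_set, to_set):
    """Average distance from each point in `from_set` to its nearest point in `to_set`."""
    return sum(min(math.dist(p, q) for q in to_set) for p in from_set) / len(from_set)


def generational_distance(obtained, reference):
    """GD: how far the obtained front lies from the reference front."""
    return _mean_nearest(obtained, reference)


def inverted_generational_distance(obtained, reference):
    """IGD: how well the obtained front covers the reference front."""
    return _mean_nearest(reference, obtained)


reference = [(0.0, 1.0), (0.5, 0.5), (1.0, 0.0)]
obtained = [(0.0, 1.0), (1.0, 0.0)]
print(generational_distance(obtained, reference),
      round(inverted_generational_distance(obtained, reference), 4))  # 0.0 0.2357
```

In this toy case GD is zero because every obtained point lies on the reference front, while IGD is positive because the middle of the reference front is not covered. Lower values are better for both.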

Table 16 Results of a fitness function for multi-objective algorithms

Fig. 20 Average ranking of the comparison between the proposed MOAHA and the other seven algorithms with the fitness functions

Figure 21 shows the best Pareto-optimal solutions chosen by MOAHA based on the distances between the Pareto-optimal solutions and the ideal solution. The chosen solution should be the one closest to the ideal solution in terms of Euclidean distance. We also observed the relationship between the number of runs and the tour length; the results are shown in Fig. 22.
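Selecting the solution closest to the ideal solution can be sketched as follows. The ideal point is assumed here to be the component-wise minimum of the front, and the objectives are assumed to be on comparable scales; in practice they would usually be normalized first. The sample front is illustrative only.

```python
import math


def closest_to_ideal(front):
    """Return the front member with the smallest Euclidean distance to the ideal point."""
    # Assumption: the ideal point takes the best (minimum) value of each objective.
    ideal = tuple(min(p[i] for p in front) for i in range(len(front[0])))
    return min(front, key=lambda p: math.dist(p, ideal))


front = [(1.0, 9.0), (3.0, 4.0), (8.0, 2.0)]
print(closest_to_ideal(front))  # (3.0, 4.0)
```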

Fig. 21 Best Pareto-optimal front obtained by MOAHA according to the fitness function scenario from 100 to 500 iterations

Fig. 22 Visualization of waste truck routing for each algorithm when running from 1 to 500 iterations

Compared with the other algorithms (MOSA, MOWOA, MSSA, MOGWO, MOGOA, MOPSO, and MDEA), the fitness function yields different Pareto fronts. MOAHA has the best convergence (Fig. 19) toward the vast majority of the true Pareto-optimal fronts. Compared with the MOSA, MOWOA, MOGWO, MOPSO, and MDEA algorithms, the proposed MOAHA algorithm produced the best results on the statistical measures IGD, GD, metric of spread, and metric of spacing. MOAHA also outperformed all other algorithms across the board in terms of network size. Table 16 shows the routing results for the proposed MOAHA and the seven other algorithms over 100 to 500 iterations, and Fig. 22 visualizes the routing from the truck to the smart bins, where MOAHA produces a more robust result than the other seven algorithms.

7 Discussion

The experimental results above compare the evaluation metrics obtained for the various models. The proposed IWMS proved successful in saving energy for waste bins, generating values for missing data, and choosing the best routing path for waste trucks, which improves fuel efficiency and reduces the time consumed. To evaluate each algorithm in all phases of the system, we use several performance metrics that have been identified as promising for assessing the quality of the solutions provided by evolutionary algorithms and for comparing them, which makes these metrics a key ingredient in supporting the preferences of decision-makers.

Although our work has yielded promising results, it has some limitations. First, learning the parameters of each algorithm and making comparisons is a time-consuming process. Second, the available data are incomplete (missing data), which we addressed by proposing the AHA-KNN method to generate the missing values and obtain the best results. Finally, there is a lack of previous research on the topic. Prior studies form the basis of the literature review and provide the theoretical foundations for the research question that we are investigating; however, given the scope of our research topic, the prior studies relevant to this paper are limited.

8 Conclusion and future work

This paper proposes a new system for waste management called the intelligent waste management system (IWMS). The system addresses several problems in waste management, including energy consumption, missing data, and the routing of the waste truck. First, the smart bins are clustered into groups using an optimized Low Energy Adaptive Clustering Hierarchy (LEACH), the AHA-LEACH technique. To limit the likelihood of excessive cluster head distribution, the appropriate number of cluster heads is estimated based on the overall energy consumption per round, which reduces missing data. Second, a new optimized version of K-Nearest Neighbor was introduced to complete the missing data: KNN is combined with AHA optimization to choose the best k and improve the completion of the data. The proposed AHA-KNN was executed to reconstruct missing data and compared with mean imputation and standard KNN; it has a lower mean error than both. In the last phase, routing is solved by the proposed MOAHA algorithm, which chooses the best truck route from the disposal center to the smart bins. New fitness functions are proposed in this paper to reduce energy consumption for smart bins. In addition, four measures were used to evaluate the proposed MOAHA: IGD, GD, metric of spread, and metric of spacing. According to the tests, compared with seven well-known optimization algorithms, MOAHA can discover the optimal Pareto front (PF) and the best non-dominated solutions. In the future, the MOAHA method can be used to solve real-world and technical problems and can be extended to optimize the routing of multiple trucks, not only one truck, from the disposal center to the smart bins.