Introduction

Over the last three decades, several natural phenomena have been used to formulate new mathematical generalizations for solving optimization problems in areas including medical imaging1,2,3, robotics, business management, mathematical science4,5, segmentation6,7,8,9,10, clustering11, feature selection12,13,14,15,16, among others17,18,19. These algorithms are used because of their simple implementation and low computational complexity. Apart from that, the algorithms converge faster, have fewer parameters, and possess better exploration (expl) as well as exploitation (expt) properties. The algorithms include swarm intelligence algorithms (SIAs) and evolutionary algorithms (EAs). EAs consist of genetic algorithm (GA)20, memetic algorithm (MA)21, scatter search (SS)22, differential evolution (DE)23, stochastic fractal search (SFS)24, among others. These algorithms are among the earliest known algorithms in this domain and use the theory of evolution. SIAs, on the other hand, follow swarming and are based on social groupings such as bird flocking, colonies of insects, herds of animals, and others. Some of the major algorithms of this group include particle swarm optimization (PSO)25, symbiotic organisms search (SOS)26, sandpiper optimization algorithm (SOA)27, red fox optimization (RFO)28, golden eagle optimizer (GEO)29, grasshopper optimization algorithm (GOA)30, clouded leopard optimization (CLO)31, hermit crab shell exchange (HCSE)32, mud ring algorithm (MRA)33, seahorse optimizer (SHO)34, escaping bird search (EBS)35, and honey badger algorithm (HBA)36, among others.

The naked mole-rat algorithm (NMRA) is a recently proposed SIA37 and has been applied to many domain research problems. It takes inspiration from the worker-breeder relationship of naked mole-rats (NMRs) found in nature, is based on the mating behaviour of NMRs, and consists of two parts. The first part of the algorithm is the worker phase, which consists of arbitrary solutions to control the expl operation, whereas the second part is the breeder phase, which is intended for the expt operation. The algorithm is simple to implement, but for higher dimensions it becomes very difficult to optimize the results38. Apart from that, the algorithm still suffers from poor expl due to limited randomization in the worker phase, and there are chances of stagnation as well.

In the current article, a new hybrid variant of NMRA is proposed. This newly proposed algorithm uses a combination of three new algorithms, namely HBA36, dwarf mongoose algorithm (DMA)39 and Prairie dog optimizer (PDO)40. All of these algorithms are added in a specific manner to see which algorithm fits the best for which set of iterations. In general, we have used only those equations from these algorithms, which fit the best for our proposed approach. Here DMA, PDO and HBA-based equations are added to the worker phase of NMRA for enhanced expl operation. The exploitation operation of the basic NMRA is found to be highly exploitative and is kept intact. In order to deal with the stagnation problems, we introduce a new stagnation phase. Apart from the added advantages, the algorithm is made adaptive by using six different mutation/iw operators, including chaotic41, exponential decreasing42, linearly decreasing, oscillating43, simulated annealing44, and random. All of these modifications sum up to formulate the new algorithm.

In addition, the proposed hybrid algorithm is applied to the real-world problem of image fusion of infrared (IR) and visible (VIS) images. Image fusion is a process of merging two or more images that are obtained from various sensors45,46. This is done to obtain a highly informative image that contains details which cannot be comprehended by analysing its source images independently. VIS images tend to have superior visual quality and precision background attributes. However, IR images have more resilience against varying light levels and environmental deterioration47. Therefore, this work employs the proposed algorithm as an optimization strategy to perform region-based image fusion of IR and VIS images. Here, DHPN is used to perform segmentation of the VIS image to extract its salient features. The derived features are utilized for weight map computation of both VIS and IR images. These weight map functions are further used to form the final fused image.

The major highlights of this work are presented as:

  • The best-known equations of four new algorithms, namely PDO, DMA, HBA, and NMRA, are used to formulate a new multi-hybrid algorithm.

  • The equations of PDO, DMA, and HBA are incorporated into the worker phase of NMRA, for improved exploration operation, whereas the exploitation phase is kept same.

  • The parameters of the proposed algorithm are made self-adaptive in nature by using six different mutation/iw operators. These operators are chosen to perform better exploration and exploitation along with a balanced operation.

  • A new stagnation phase is introduced to help the algorithm come out of local optima stagnation and find the near-optimal solution.

  • To test the performance of the proposed algorithm, CEC 2005 and CEC 2019 benchmark problems are used. These are challenging problems, and any algorithm performing well on these problems can be considered as a potential candidate for future research.

  • Apart from these benchmark datasets, the proposed algorithm is further used for the optimization of real-world IR and VIS images. In the current study, we perform region-based image fusion of both IR and VIS images using the proposed DHPN algorithm.

Excluding introduction, the paper has seven sections. The second section provides details about the major algorithms used for formulating the proposed algorithm and a summary of literature. Section three details the requirement of the proposal, basic motivation, and extended novelty of the proposed work. We provide detailed results of the numerical benchmarks in Sect. 4. We have used two sets of benchmarks, namely CEC 200548 and CEC 2019 benchmarks49. For experimental tests, we have used various new and hybrid algorithms, including JADE50, LSHADE-SPACMA51, and success-history based adaptive DE (SHADE)52 among others. A detailed parametric study, along with dimension size and population size (pop size) variability, is also checked and tested to see the applicability of the proposed work for higher dimension problems. In Sect. 5, we have used the proposed optimization algorithm for region-based image fusion of IR and VIS images. In the sixth section, a detailed summary of the results is presented. This section also includes some drawbacks of the proposed algorithm, along with some future insights. Conclusions are drawn and some future directions are given in the final section.

Basics of nature-inspired algorithms

This section provides details on the algorithms used as the basis for the proposed algorithm.

Naked mole-rat algorithm

NMRA was developed from the naturally occurring mating behaviour of naked mole-rat (NMR) species. NMRA is broken down into three phases. Apart from initialization, the worker phase is responsible for the expl operation, whereas the breeder phase is meant for expt. Both these phases are shared by practically all nature-inspired algorithms. The essential mathematical formulations of NMRA are as follows:

Initialization: The initial NMR population of n individuals is defined in this phase, where each individual is a D-dimensional vector. Here, n stands for the pop size and D denotes the number of parameters of the test problem. Each NMR is initialized as:

$$\begin{aligned} N_{i,j}=N_{min,j}+r\times (N_{max,j}-N_{min,j}) \end{aligned}$$
(1)

where i and j are defined in the intervals [1, 2, ..., n] and [1, 2, ..., D], respectively. \(N_{min,j}\) is the lower boundary and \(N_{max,j}\) is the upper boundary of the problem under investigation, and r is a uniformly distributed random number in [0, 1].

Worker phase: Here, each worker NMR's fitness is assessed, and according to the result, it may qualify to join the breeders. A chance of mating exists if the worker NMR joins the breeders. The worker NMR solution is computed utilizing prior knowledge. The workers are generated using

$$\begin{aligned} x_{i}^{t+1}=x_{i}^{t}+\lambda (x_p^t-x_q^t) \end{aligned}$$
(2)

where \(x_i^t\) is the solution of the i th worker at the t th iteration, and \(x_i^{t+1}\) is the new solution produced by leveraging the prior solution. The randomly selected solutions are \(x_p^t\) and \(x_q^t\). The mating factor \(\lambda\) is uniformly distributed and falls between [0, 1].

Breeder phase: To mate with the queen or continue breeding, NMRs in the breeder group must become more physically fit. The best breeders \(n_{best}\) update their position according to the probability of breeding (bp). Here, it is important to keep in mind that breeders whose fitness does not improve over time are shifted to the worker group. The new solutions in this phase are given by

$$\begin{aligned} y_{i}^{t+1}=(1-\lambda )y_i^t+\lambda (n_{best}-y_i^t) \end{aligned}$$
(3)

where \(y_i^t\) is the result of the i th breeder during the t th iteration, \(\lambda\) denotes a parameter that regulates mating and \(y_i^{t+1}\) is the new breeder produced in the subsequent iteration. Initially set to 0.5, the value of bp can be any number between [0,1].
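For illustration, the two update rules of Eqs. (2) and (3) can be sketched in Python/NumPy as follows. The population layout (workers first, breeders after), the per-update sampling of \(\lambda\), and the default bp of 0.5 are assumptions made only for this sketch.

```python
import numpy as np

def nmra_step(pop, n_best, n_workers, bp=0.5):
    """One iteration of the basic NMRA updates (Eqs. 2 and 3).

    pop       : (n, D) array of current solutions, workers first, breeders after.
    n_best    : (D,) overall best solution (the queen).
    n_workers : number of worker NMRs at the start of pop.
    """
    new_pop = pop.copy()
    n = pop.shape[0]
    # Worker phase (Eq. 2): perturb each worker using two randomly chosen solutions.
    for i in range(n_workers):
        lam = np.random.rand()
        p, q = np.random.choice(n, 2, replace=False)
        new_pop[i] = pop[i] + lam * (pop[p] - pop[q])
    # Breeder phase (Eq. 3): with probability bp, move a breeder towards the best solution.
    for i in range(n_workers, n):
        if np.random.rand() < bp:
            lam = np.random.rand()
            new_pop[i] = (1 - lam) * pop[i] + lam * (n_best - pop[i])
    return new_pop
```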

Grey wolf optimizer

As the name suggests, GWO is inspired by grey wolves. The individuals follow a strict dominance hierarchy and are categorized as alpha, beta, delta and omega wolves. The mathematical model of GWO is given as:

Social hierarchy: The best solution in each generation is considered the alpha \((\alpha )\) wolf, the second best is beta \((\beta )\), the third is delta \((\delta )\), and the rest are omegas \((\omega )\).

Encircling prey: Grey wolf’s ability is to recognize prey and encircle it. Mathematically, this behaviour is formulated as

$$\begin{aligned}{} & {} D=|P.x_p^t-x^t| \end{aligned}$$
(4)
$$\begin{aligned}{} & {} x^{t+1}=x_p^t-R.D \end{aligned}$$
(5)

where t represents the iteration counter, \(x_p^t\) and \(x^t\) denote the position of prey and grey wolf, respectively. The values of R and P are as:

$$\begin{aligned} R=2a.r_1-a \end{aligned}$$
(6)
$$\begin{aligned} P=2.r_2 \end{aligned}$$
(7)

where a decreases from 2 to 0 over the course of iterations and \(r_1\), \(r_2\) \(\in\) [0,1].

Hunting: It is done by \(\alpha\), \(\beta\) and \(\delta\). Guided by \(\alpha\), here \(\beta\) and \(\delta\) might also participate in hunting occasionally. Thus, the remaining two agents (\(\omega\) wolves) will also participate in hunting, and their position will be decided according to the positions of \(\alpha\), \(\beta\), \(\delta\). The mathematical expression used in this phase is:

$$\begin{aligned}{} & {} D_{\alpha }=|Y_1.x_{\alpha }-x|;\hspace{5pt} D_{\beta }=|Y_2.x_{\beta }-x|;\hspace{5pt} D_{\delta }=|Y_3.x_{\delta }-x| \end{aligned}$$
(8)
$$\begin{aligned}{} & {} o_1=x_{\alpha }-G_1.(D_{\alpha });\hspace{5pt} o_2=x_{\beta }-G_2.(D_{\beta });\hspace{5pt} o_3=x_{\delta }-G_3.(D_{\delta }) \end{aligned}$$
(9)
$$\begin{aligned}{} & {} x^{t+1}=\frac{o_1+o_2+o_3}{3} \end{aligned}$$
(10)

Attacking the prey (expt phase): In the final stage, the prey is attacked as per the hunting mechanism. The parameter a, which initially equals 2, is gradually decreased to 0. This continuously and correspondingly affects the parameter R as per Eq. (6), so that R takes values in the range [−a, a]. Thus, for \(|R|<1\), the wolves forcefully attack the prey.

Searching prey (expl phase): To search for prey, grey wolves must diverge from one another, which gives two options. Mathematically, this corresponds to \(R \le -1\) or \(R > 1\). These two options allow the algorithm to explore further: when \(|R|>1\), the wolves separate and search for new prey. The parameter P helps the algorithm control expl by assigning random weights to the prey, thereby introducing random behaviour that helps avoid local optima.
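A minimal Python/NumPy sketch of the hunting update (Eqs. 8-10) is given below; sampling G and Y per dimension and decreasing a linearly outside the function are common conventions assumed here, not details fixed by the text above.

```python
import numpy as np

def gwo_update(wolves, x_alpha, x_beta, x_delta, a):
    """GWO position update (Eqs. 8-10) for every wolf in the pack.

    wolves                   : (n, D) array of current positions.
    x_alpha, x_beta, x_delta : (D,) three best solutions found so far.
    a                        : scalar decreased from 2 to 0 over the iterations.
    """
    new_pos = np.empty_like(wolves)
    for i, x in enumerate(wolves):
        candidates = []
        for leader in (x_alpha, x_beta, x_delta):
            G = 2 * a * np.random.rand(x.size) - a      # Eq. (6)
            Y = 2 * np.random.rand(x.size)              # Eq. (7)
            dist = np.abs(Y * leader - x)               # Eq. (8)
            candidates.append(leader - G * dist)        # Eq. (9)
        new_pos[i] = sum(candidates) / 3.0              # Eq. (10)
    return new_pos

# Typical usage inside the main loop: a = 2 * (1 - t / t_max)
```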

Cuckoo search

CS draws inspiration from nature and can handle challenging optimization issues. The main inspiration for this method came from the parasitic behavior of the natural cuckoo species. CS consists primarily of global (expl) and local search (expt), to obtain the optimum solution.

Global search phase: This phase focuses on the expl part of the CS algorithm and assumes that a cuckoo lays only a single egg. With this supposition, a new solution \(o_i^{t+1}\) is evolved for the i th cuckoo employing Lévy flights. The solution is given by:

$$\begin{aligned} o_i^{t+1}=o_i^{t}+\alpha \hspace{2pt}\otimes Levy (\lambda ) (x_{best}-o_i^t) \end{aligned}$$
(11)

where \(o_i^{t}\) denotes the solution of i th cuckoo at t th iteration, \(o_i^{t+1}\) represents the next solution and \(x_{best}\) is the current best solution. Here, an arbitrary component of the Lévy distribution’s flight model is applied. This part, which basically imitates the cuckoo species’ flight path, is characterized as:

$$\begin{aligned} Levy\sim \mu =t^{-\Phi },(1<\Phi \le 3) \end{aligned}$$
(12)

where \(\Phi\) is the exponent of the Lévy distribution governing the step lengths.

Local search phase: The algorithm’s expt actions are reflected in this phase. The solution \(o_i^{t+1}\) is built in this phase on the basis of two random solutions \(x_p^t\), \(x_q^t\) from the total population. For this stage, the following equation will produce a solution:

$$\begin{aligned} o_{i}^{t+1}=o_i^t+\alpha \otimes (\epsilon )\otimes (x_p^t-x_q^t) \end{aligned}$$
(13)

where \(\epsilon \in [0,1]\).
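The two CS moves (Eqs. 11 and 13) can be sketched as below. The Lévy step is generated with Mantegna's algorithm and a step size of \(\alpha = 0.01\) is used for the global move; both are common conventions in CS implementations and are assumptions here.

```python
import numpy as np
from math import gamma, pi, sin

def levy_step(D, beta=1.5):
    """Levy-distributed step of dimension D via Mantegna's algorithm."""
    sigma = (gamma(1 + beta) * sin(pi * beta / 2) /
             (gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = np.random.randn(D) * sigma
    v = np.random.randn(D)
    return u / np.abs(v) ** (1 / beta)

def cs_global(o_i, x_best, alpha=0.01):
    """Global (expl) move of CS, Eq. (11)."""
    return o_i + alpha * levy_step(o_i.size) * (x_best - o_i)

def cs_local(o_i, x_p, x_q, alpha=1.0):
    """Local (expt) move of CS, Eq. (13)."""
    eps = np.random.rand(o_i.size)
    return o_i + alpha * eps * (x_p - x_q)
```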

Honey badger algorithm

The honey badger (HB) is a fluffy black and white mammal that lives in semi-deserts and rainforests in Southwest Asia, Africa, and the Indian subcontinent. HBA mimics an HB's foraging behaviour. The HB either smells and digs to find food sources, or it follows a honeyguide bird. The two situations are referred to as digging mode and honey mode, respectively. In the former, it first senses smell to roughly localise the prey and then chooses the best digging spot to capture it. In the latter, it uses a honeyguide bird to locate a beehive directly.

Algorithmic Steps: Theoretically, HBA includes both the expl and expt stages.

Population = \(\begin{bmatrix} o_{11} &{}\quad o_{12} &{}\quad o_{13} &{}\quad ... &{}\quad o_{1D}\\ o_{21} &{}\quad o_{22} &{}\quad o_{23} &{}\quad ... &{}\quad o_{2D}\\ ... &{}\quad ... &{}\quad ... &{}\quad ... &{}\quad ... \\ o_{n1} &{}\quad o_{n2} &{}\quad o_{n3} &{}\quad ... &{}\quad o_{nD}\\ \end{bmatrix}\)

The i th position of HB \(o_i\) = \([o_i^1,o_i^2,....,o_i^D]\)

  1. Initialization phase: Set HBs and their locations to their initial values based on Eq. (14).

    $$\begin{aligned} x_{i}=lb_{i}+r_{1} * (ub_{i} - lb_{i}) \end{aligned}$$
    (14)

    where \(r_1 \in [0,1]\), \(x_i\) is the i th HB's position, \(lb_i\) is the lower bound and \(ub_i\) is the upper bound.

  2. Intensity (I): This is the smell intensity of the prey, which depends on the prey's concentration strength and its distance from the HB, and is given by

    $$\begin{aligned}{} & {} I_i = r_2 * S/4\pi g_i^2 \end{aligned}$$
    (15)
    $$\begin{aligned}{} & {} S=(o_i - o_{i+1})^2 \end{aligned}$$
    (16)
    $$\begin{aligned}{} & {} g_i = x_{prey} -o_i \end{aligned}$$
    (17)

    where S defines the source strength and \(g_i\) is the distance between the prey and the i th HB.

  3. Density factor \((\alpha )\): To guarantee a transition from expl to expt, the density factor regulates randomization.

    $$\begin{aligned} \alpha = C*exp(-t/t_{max}) \end{aligned}$$
    (18)

    where \(t_{max}\) is the maximum number of iterations and C is a constant (default value 2).

  4. Escaping local optimum: This step and the two that follow are utilized to leave local optima zones by employing a flag H that modifies the search direction for higher expl.

  5. Updating positions: This constitutes two phases:

  6. Digging phase: The intensity I of the prey's smell, the distance \(g_i\) between the HB and the prey, and the influence of \(\alpha\) are all important in the digging phase for HBs.

    $$\begin{aligned} x_{new}=x_{prey} + H*I*\beta *x_{prey} +H *\alpha *r_3*g_i * |cos(2\pi r_4) *[1-cos(2\pi r_5)]| \end{aligned}$$
    (19)

    where \(x_{prey}\) signifies the location of the prey; \(\beta \ge 1\) (default = 6) is the HB's capability to search for food; \(g_i\) denotes the distance between the i th HB and the prey; and \(r_3, r_4, r_5 \in [0, 1]\).

  7. Honey phase: We can model the scenario where an HB pursues a honeyguide bird to a beehive as

    $$\begin{aligned} x_{new} =x_{prey}+H*r_7*\alpha *g_i \end{aligned}$$
    (20)

    where \(x_{new}\) is the new position of the HB, \(x_{prey}\) is the prey's location, and \(r_7 \in [0,1]\).
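The digging and honey modes (Eqs. 15-20) can be combined into one position proposal as sketched below. Treating the intensity I as a scalar, guarding the division with a small constant, and choosing between the two modes with a 0.5 coin flip are assumptions made for this sketch.

```python
import numpy as np

def hba_update(o_i, o_next, x_prey, t, t_max, beta=6.0, C=2.0):
    """One HBA position proposal (Eqs. 15-20) for a single honey badger.

    o_i, o_next : current HB and a neighbouring solution (used for S in Eq. 16).
    x_prey      : best solution found so far (the prey).
    """
    H = np.random.choice([-1, 1])                       # flag that alters the search direction
    g = x_prey - o_i                                    # distance to the prey, Eq. (17)
    S = np.sum((o_i - o_next) ** 2)                     # source strength, Eq. (16)
    I = np.random.rand() * S / (4 * np.pi * np.sum(g ** 2) + 1e-12)  # smell intensity, Eq. (15)
    alpha = C * np.exp(-t / t_max)                      # density factor, Eq. (18)
    r3, r4, r5, r6, r7 = np.random.rand(5)
    if r6 < 0.5:                                        # digging phase, Eq. (19)
        return (x_prey + H * beta * I * x_prey
                + H * alpha * r3 * g * np.abs(np.cos(2 * np.pi * r4)
                                              * (1 - np.cos(2 * np.pi * r5))))
    return x_prey + H * r7 * alpha * g                  # honey phase, Eq. (20)
```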

Dwarf mongoose algorithm

The tiniest carnivore found in Africa is the dwarf mongoose (DM). Due to their territorial nature, DMs frequently mark horizontal objects in their territory with their cheek and anal glands.

The DMO algorithm models the DM's compensatory behavioural adaptations. Limiting the size of the prey, social behaviour (babysitters), semi-nomadic living, and other adaptations are examples of compensatory behaviour. The model consists of DMs acting as scouts, the alpha group, and babysitters. The DMO algorithm starts by initialising the candidate population of the DMs (O)

$$\begin{aligned} O = \begin{bmatrix} m_{1,1} &{}\quad m_{1,2} &{}\quad ... &{}\quad m_{1,D-1} &{}\quad m_{1,D}\\ m_{2,1} &{}\quad m_{2,2} &{}\quad ... &{}\quad m_{2,D-1} &{}\quad m_{2,D}\\ ... &{}\quad ... &{}\quad m_{i,j} &{}\quad ... &{}\quad ... \\ m_{n,1} &{}\quad m_{n,2} &{}\quad ... &{}\quad m_{n,D-1} &{}\quad m_{n,D}\\ \end{bmatrix} \end{aligned}$$
(21)

where O is the randomly generated population, \(m_{i,j}\) signifies the location of the j th dimension of the i th member, and n and D denote the pop size and dimension, respectively.

The proposed DMO algorithm’s optimization processes are divided into three phases.

Alpha group

The alpha female (\(\alpha\)) is chosen according to the probability calculated for each population fitness.

$$\begin{aligned} \alpha =\frac{fit_i}{\sum _{i=1}^{n}fit_i} \end{aligned}$$
(22)

If bs represents the number of babysitters, then \(n-bs\) gives the number of DMs in the alpha group. Every DM sleeps in the initial sleeping mound, which is set to \(\phi\).

$$\begin{aligned} o_{i+1} = o_i +\phi *peep \end{aligned}$$
(23)

where \(\phi \in [-1, 1]\)

The sleeping mound is given as \(sm_i\)

$$\begin{aligned} sm_i=\frac{fit_{i+1} - fit_i}{\max \{fit_{i+1},fit_i\}} \end{aligned}$$
(24)

The sleeping mound’s average value is provided by;

$$\begin{aligned} \varphi =\frac{\sum _{i=1}^{n}sm_i}{n} \end{aligned}$$
(25)

Once the criterion is satisfied, DMO hops to the scout phase, where another source is assessed.

Scout group

The DMs are known to avoid returning to the former sleeping mound, so the scouts search for an adjacent one to ensure expl. This is modelled as

$$\begin{aligned} o_{i+1}= \left\{ \begin{array}{ll} o_i - CD*\phi *rand*[o_i-\textbf{M}] &{}\quad \text{ if } \varphi _{i+1}>\varphi _i \\ o_i + CD*\phi *rand*[o_i-\textbf{M}] &{}\quad \text{ else } \end{array} \right. \end{aligned}$$
(26)

where rand denotes a random number in [0, 1], CD = \(\Big (1-\frac{iter}{Max_{iter}}\Big )^{\Big (2\frac{iter}{Max_{iter}}\Big )}\) regulates the collective-volatile movement of DM’s, and is decreased linearly. \(\textbf{M} = \sum _{i=1}^{n} \frac{o_i*sm_i}{o_i}\) determines the movement of the DM towards the new food source.
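A compact sketch of the scout-phase move (Eq. 26) is shown below; the movement vector M and the sleeping-mound averages are assumed to be computed elsewhere in the main loop.

```python
import numpy as np

def dmo_scout_step(o_i, M, phi_prev, phi_new, t, t_max):
    """Scout-phase position update of the dwarf mongoose algorithm (Eq. 26).

    o_i               : current DM position.
    M                 : movement vector towards the new food source.
    phi_prev, phi_new : average sleeping-mound values of the previous and current iterations.
    """
    CD = (1 - t / t_max) ** (2 * t / t_max)   # collective-volatile movement factor
    phi = np.random.uniform(-1, 1)
    step = CD * phi * np.random.rand() * (o_i - M)
    # Move away from (or towards) the old mound depending on the mound-quality trend.
    return o_i - step if phi_new > phi_prev else o_i + step
```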

The babysitters

The group's subordinate members who stay with the babies are typically rotated on a regular basis as babysitters, allowing the alpha to guide the other group members. She usually comes back to nurse in the afternoon/evening. The total number of babysitters varies with the pop size and has an impact on DMO by decreasing the effective pop size according to a preset percentage. The exchange parameter helps reset the information between previous and current family members.

Prairie dog optimization algorithm

Prairie dogs (PDs) are quite sociable and prefer to live together underground in large colonies. A colony usually houses 15-26 family units, or coteries, with each coterie residing in its respective ward. The functionality and complexity of the subunits of a colony are the same irrespective of the colony's size.

In PDO40, PD populations are search agents, whereas their location represents the possible solution. The mathematical model of PDO is described below:

Initialization

Consider n PDs per coterie and m coteries in the colony. The location of all coteries in a colony can be represented by a matrix CO given as:

$$\begin{aligned} CO = \begin{bmatrix} CO_{1,1} &{}\quad CO_{1,2} &{}\quad ... &{}\quad CO_{1,D-1} &{}\quad CO_{1,D}\\ CO_{2,1} &{}\quad CO_{2,2} &{}\quad ... &{}\quad CO_{2,D-1} &{}\quad CO_{2,D}\\ ... &{}\quad ... &{}\quad CO_{i,j} &{}\quad ... &{}\quad ... \\ CO_{m,1} &{}\quad CO_{m,2} &{}\quad ... &{}\quad CO_{m,D-1} &{}\quad CO_{m,D}\\ \end{bmatrix} \end{aligned}$$
(27)

where \(CO_{i,j}\) denotes the jth dimension of the ith coterie. The location of all the PDs in a coterie can be represented by:

$$\begin{aligned} LO = \begin{bmatrix} LO_{1,1} &{}\quad LO_{1,2} &{}\quad ... &{}\quad LO_{1,D-1} &{}\quad LO_{1,D}\\ LO_{2,1} &{}\quad LO_{2,2} &{}\quad ... &{}\quad LO_{2,D-1} &{}\quad LO_{2,D}\\ ... &{}\quad ... &{}\quad LO_{i,j} &{}\quad ... &{}\quad ... \\ LO_{n,1} &{}\quad LO_{n,2} &{}\quad ... &{}\quad LO_{n,D-1} &{}\quad LO_{n,D}\\ \end{bmatrix} \end{aligned}$$
(28)

where \(LO_{i,j}\) denotes the jth dimension of the ith PD such that \(n \le m\). If U(0, 1) denotes a random number with a uniform distribution, then each coterie and PD location is given by:

$$\begin{aligned} CO_{i,j}=U(0,1) \times (UB_j -LB_j)+LB_j \end{aligned}$$
(29)
$$\begin{aligned} LO_{i,j}=U(0,1) \times (ub_j -lb_j)+lb_j \end{aligned}$$
(30)

where \(ub_j=\frac{UB_j}{m}\) and \(lb_j=\frac{LB_j}{m}\), such that \(UB_j\) and \(LB_j\) represent the upper and lower bounds of the jth dimension, respectively.

Fitness function evaluation

The fitness function value of each PD location in a coterie is calculated by feeding the candidate solution into the fitness function, defined as

$$\begin{aligned} fit(LO) = \begin{bmatrix} fit_1(LO_{1,1} &{}\quad LO_{1,2} &{}\quad ... &{}\quad LO_{1,D-1} &{}\quad LO_{1,D})\\ fit_2(LO_{2,1} &{}\quad LO_{2,2} &{}\quad ... &{}\quad LO_{2,D-1} &{}\quad LO_{2,D})\\ fit_n(LO_{n,1} &{}\quad LO_{n,2} &{}\quad ... &{}\quad LO_{n,D-1} &{}\quad LO_{n,D})\\ \end{bmatrix} \end{aligned}$$
(31)

At each iteration, the fitness function values are evaluated for all PDs and stored as \((n\times 1)\) matrix. Here, each fitness value represents the food source quality, the ability to build new burrows, and the apt response to the predators.

Exploration

The exploration operation is carried out in the \(0<iter<\frac{T}{4}\) and \(\frac{T}{4}\le iter<\frac{T}{2}\) intervals. The first step in this phase is the movement of the PDs of a coterie out of the ward in search of food. The position update for this search can be expressed as

$$\begin{aligned} LO_{i+1}^{j+1}=gLO_{i,j}^{Best}-ecLO_{i,j}^{Best} \times \epsilon - CLO_{i,j} \times Levy(n) \ \forall \ 0<iter<\frac{T}{4} \end{aligned}$$
(32)

where \(gLO_{i,j}^{Best}\) represents the globally best solution found so far, \(\epsilon\) denotes the food source alarm, and Levy(n) represents a Lévy-distributed random step. Here, \(ecLO_{i,j}^{Best}\) signifies the effect of the current best solution, which is defined as

$$\begin{aligned} ecLO_{i,j}^{Best}=gLO_{i,j}^{Best} \times \tau + \frac{LO_{i,j}\times mean(LO_{n,m})}{gLO_{i,j}^{Best}\times (UB_j-LB_j)+\tau } \end{aligned}$$
(33)

where \(\tau\) signifies the individual PD position difference. Also, the collective impact of all the PDs in the colony, \(CLO_{i,j}\), is given by

$$\begin{aligned} CLO_{i,j}=\frac{gLO_{i,j}^{Best}-rLO_{i,j}}{gLO_{i,j}^{Best}+\tau } \end{aligned}$$
(34)

where \(rLO_{i,j}\) is the random position of the ith PD in the jth dimension.

The second step is to evaluate the food quality along with the digging strength in order to build new burrows. The position updating for the building of a burrow can be expressed as

$$\begin{aligned} LO_{i+1}^{j+1}=gLO_{i,j}^{Best} \times rLO \times DS \times Levy(n) \ \forall \ \frac{T}{4}\le iter<\frac{T}{2} \end{aligned}$$
(35)

where DS denotes the digging strength as defined below

$$\begin{aligned} DS=1.5 \times s \times \left( 1-\frac{iter}{Max_{iter}}\right) ^{\left( 2\frac{iter}{Max_{iter}}\right) } \end{aligned}$$
(36)

where s can be either -1 or 1 according to the odd or even current iteration, respectively.

Exploitation

In PDO, the exploitation mechanism is utilised to focus the search on promising locations identified in the previous phase. It is implemented according to the equations given below

$$\begin{aligned}{} & {} LO_{i+1}^{j+1}=gLO_{i,j}^{Best}-ecLO_{i,j}^{Best} \times \rho - CLO_{i,j} \times rand \ \forall \ \frac{Max_{iter}}{2}\le iter<3\frac{Max_{iter}}{4} \end{aligned}$$
(37)
$$\begin{aligned}{} & {} LO_{i+1}^{j+1}=gLO_{i,j}^{Best} \times PR \times rand \ \forall \ 3\frac{Max_{iter}}{4} \le iter \le Max_{iter} \end{aligned}$$
(38)

where \(\rho\) represents the food source quality, rand is a random number between 0 and 1 and PR is the effect of predators which can be expressed as

$$\begin{aligned} PR=1.5 \times (1-\frac{iter}{Max_{iter}})^{(2\frac{iter}{Max_{iter}})}. \end{aligned}$$
(39)
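The four time-windowed PDO moves (Eqs. 32-39) can be summarized in one dispatch function as sketched below. The constants eps, rho and tau, the unit scaling of the bound term in Eq. (33), and the Cauchy draw standing in for the Lévy step are illustrative assumptions, not values prescribed above.

```python
import numpy as np

def pdo_update(LO, rLO, g_best, t, t_max, eps=0.1, rho=0.1, tau=0.005):
    """Schematic PDO position update (Eqs. 32-39) for one prairie dog.

    LO     : current PD position, rLO : position of a randomly chosen PD,
    g_best : globally best position found so far.
    """
    D = LO.size
    CLO = (g_best - rLO) / (g_best + tau)                            # Eq. (34)
    ecLO = g_best * tau + LO * np.mean(LO) / (g_best + tau)          # Eq. (33), (UB - LB) taken as 1
    s = -1 if t % 2 else 1
    decay = (1 - t / t_max) ** (2 * t / t_max)
    DS, PR = 1.5 * s * decay, 1.5 * decay                            # Eqs. (36) and (39)
    levy = np.random.standard_cauchy(D)                              # crude stand-in for Levy(n)
    if t < t_max / 4:                                                # foraging, Eq. (32)
        return g_best - ecLO * eps - CLO * levy
    if t < t_max / 2:                                                # burrow building, Eq. (35)
        return g_best * rLO * DS * levy
    if t < 3 * t_max / 4:                                            # food quality, Eq. (37)
        return g_best - ecLO * rho - CLO * np.random.rand(D)
    return g_best * PR * np.random.rand(D)                           # predator response, Eq. (38)
```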

Summary of literature

In the above sections, we have introduced the basic algorithms used for the formulation of the proposed algorithm. Apart from introducing a new algorithm, the proposed algorithm has also been applied to CEC benchmarks and image fusion problems. A recent list of applications with some modified algorithms is presented in Table 1.

Table 1 Recent literature on algorithms and their applications.

A more detailed review of region-based image fusion is given in Sect. 5. The literature discussed in this section provides some insights into the applicability of the recently introduced algorithms to image segmentation problems. A major research gap in most of the works discussed above lies in the implementation part of the algorithms. In most of the literature, the algorithms proposed are either simple modifications of the basic algorithm, enhancements of the parameters, or merely an adaptation in a certain section of the algorithm. There is very limited work on actual structural modification, and particularly on hybridization of algorithms, for image thresholding and segmentation problems. So, in the present work, a multi-hybrid algorithm with adaptive properties is proposed. The next section presents extensive details on how the proposed algorithm is formulated and on the whys and hows of this proposal.

The proposed approach

Among the recently introduced algorithms, NMRA has been found to provide strong expl and expt capabilities. The algorithm is highly reliable when compared with the recently introduced CS, GWO, WOA and other algorithms. A comparison with hybrid and enhanced versions shows that the algorithm has certain disadvantages too, and new improvements are required to make the algorithm self-sufficient. One of the major disadvantages is its poor expl operation, which has been highlighted in various recently introduced enhanced versions of NMRA38. This is attributed to limited randomness in the solution space of the worker phase, which leads to local optima stagnation. However, it can be improved by adding new prospective equations to the general worker phase of the algorithm, thus enhancing its global search properties. Also, the parameters need to be enhanced, and self-adaptivity must be ensured so that no manual amendments are required. Based on this and the added advantages of new equations inspired by DMA, HBA, PDO, CS and GWO, a new multi-hybrid algorithm, namely the Dwarf Honey Prairie Naked mole-rat (DHPN) algorithm, is proposed. The major highlights of this algorithm are

  • It follows the basic structure of NMRA, and new modified equations are included in the worker phase. For the first one-fourth of the iterations, basic NMRA equations are used; PDO-inspired equations are used for the second one-fourth of the iterations; DMA-based equations are used for the third one-fourth; and HBA-based equations are used for the final one-fourth of the iterations.

  • Equations of both CS and GWO are used in a hybrid manner, inspired by the self-adaptive cuckoo search algorithm (SACS)56, and are meant for updating the whole solution set if the algorithm gets stuck. That is, if the solution quality does not improve for a certain number of iterations, the stagnation phase is activated, serving as an effective mechanism for avoiding local optima stagnation.

  • To add self-adaptive properties, six different mutation operators are applied to the basic random parameters of the DHPN algorithm. All of these mutation operators have been exploited in the literature, and more details are presented in subsequent subsections.

A detailed discussion of the what, how, and why of the proposal is given in the next subsections.

What is the requirement of the proposal

In the recent literature, it has been found that new algorithms are continuously being proposed and added to the expanding literature. However, the performance evaluation of these algorithms is often limited to certain basic optimization algorithms, and a comparison with recent hybrid algorithms is missing. Even where such a comparison is present, closer inspection shows that there is no significant improvement in the performance of the newly proposed algorithms. Another thing that has been pointed out in the literature is the prevalence of problem-based modification, as stated by the no free lunch (NFL) theorem57. According to NFL, no algorithm is perfect for all problems, and user-based enhancements are required to fit it to a certain domain research problem. This is because every optimization problem presents a different scenario, including variable dimension size, constrained or unconstrained nature, computational complexity, scalability, and others. The total number of local minima also poses a significant challenge to solving these problems. This provides us with enough evidence and motivation for the proposal of new algorithms. Apart from these generalized scenarios, NMRA also has the drawback of poor expl, and user-based modification is required to improve its expl properties. NMRA has poor expl because of the limited randomness of its solutions and the problem of stagnation38. Apart from these problems, the basic algorithm uses only random initialization of parameters, which makes it very difficult to identify which combination of parameter settings fits best. A simple constant value is employed in most cases, which leads to constant step sizes and restricts the search of the algorithm to particular regions. Adding a new combination of mutation operators using self-adaptive formulations makes the algorithm follow variable steps and provides excellent expl and expt properties. This provides us with enough motivation to propose a new prospective algorithm. In the present case, we present a novel DHPN algorithm based on the combined properties of different algorithms and mutation operators. New equations are added in the global search or worker phase. The breeder phase is kept as such, and no equations are modified. More details on how the modifications have been added are provided in the next subsections.

Motivation behind the proposal

Based on the rise in hybridization among optimization researchers, some of the most successful results for combinatorial and practical problems have been achieved through hybrid algorithms. One of the earliest instances of algorithm combination involved simulated annealing, genetic algorithms, tabu search, descent local search, and evolutionary algorithms, yielding notable outcomes58. The taxonomy of heuristic algorithms comprises hierarchical and flat classifications. The hierarchical level reduces the number of classes, while the flat level arranges techniques in an arbitrary order. In hierarchical taxonomy, low-level and high-level hybridization are distinguished. Low-level hybridization involves replacing a portion of the algorithm with another, whereas high-level hybridization involves self-contained algorithms with no direct internal relationship. Further, low-level and high-level hybridizations are categorized as Relay and Teamwork hybridization. Relay hybridization employs multiple algorithms sequentially, with the output of one serving as the input for the next, akin to a pipelined operation. On the other hand, Teamwork hybridization employs several parallel cooperating algorithms, each conducting an independent search. Overall, these characteristics classify hybrid algorithms into four types58.

  • LRH : Low-level relay hybrid

  • LTH : Low-level teamwork hybrid

  • HRH : High-level relay hybrid

  • HTH : High-level teamwork hybrid

In LRH, a single-solution algorithm incorporates an embedded algorithm. This approach is exemplified in the combination of simulated annealing with local search for solving the travelling salesman problem59. LRH models typically balance exploration and exploitation operations by utilizing different algorithms. However, since heuristics are often stronger in exploration than in exploitation, in LTH, one algorithm handles exploration while another tackles exploitation. For instance, Chu et al.60 proposed an LTH algorithm by integrating a generalized GA with tabu search for mutation and hill-climbing for crossover operations. On the other hand, HRH evaluates self-sufficient algorithms sequentially. In a study by61, tabu search and simulated annealing were employed to enhance the population generated by a GA over iterations. In HTH, several self-sufficient algorithms collaborate in parallel, each contributing to the search operation. For example, Cohoon and Hegde62 applied a GA as the base algorithm, with sections of the population being evaluated using simulated annealing, genetic programming, evolution strategy, and tabu search to improve overall performance.

In this study, we propose a novel algorithm based on the LTH model. Our approach combines multiple algorithms to execute the exploration operation, with each algorithm providing distinct solutions. These solutions are subsequently evaluated by another algorithm during the exploitation phase. Specifically, we employ equations inspired by PDO, DMA, HBA, and NMRA for the enhanced worker phase (exploration), while the exploitation phase leverages breeder equations from NMRA. Within a single iteration, one set of equations, inspired by one algorithm, conducts the exploration operation, while the other set handles exploitation. After a predefined number of iterations, the first algorithm is substituted with a new one, gradually transitioning to a multi-hybrid approach. By utilizing a combination of algorithms and employing a teamwork strategy, our method falls within the LTH model. This approach, integrating two or more algorithms, facilitates effective exploration and exploitation, enhancing the overall optimization process.

In a generalized algorithm, the aim is to conduct extensive exploration in the initial phases, gradually transitioning towards increased exploitation, and finally emphasizing extensive exploitation in the later stages. In our approach, we utilize equations inspired by DMA, HBA, PDO, and NMRA for iterative search, while employing CS-GWO-based equations for the stagnation phase. Before implementing the multi-hybrid algorithm, a preliminary study is essential to determine which algorithms excel in exploration, exploitation, and facilitating the transition between the two. Thus, our study proposes a multi-hybrid algorithm that incorporates the most effective equations for exploration and exploitation. We divide the iterations into four distinct phases, with each phase employing a specific set of equations from one of the algorithms. Additionally, we consider the parameters of each algorithm to optimize the performance of the proposed new algorithm. This approach aims to leverage the strengths of different algorithms while ensuring an effective balance between exploration and exploitation throughout the optimization process.

The proposed approach

As already stated, the DHPN algorithm has the added advantages of PDO, DMA, HBA, and the CS/GWO-inspired self-adaptive cuckoo search (SACS)56. In this section, we present a detailed study of the proposed DHPN algorithm. It consists of five different phases: initialization is the first phase, and the second, the worker phase, incorporates the major changes and is the main phase. This phase provides excellent expl properties and incorporates the major contributions of all the new algorithms under consideration. The next phase is the breeder phase, which retains the equations of NMRA. The fourth phase is meant for the selection of the best individuals over the course of iterations. A new phase inspired by the SACS algorithm is incorporated as a stagnation phase and is meant to improve the local search capabilities and also help to counter the local optima problems. A separate subsection analyses the parameters of the proposed algorithm. Details about the algorithm are presented below:

Initialization of the proposed DHPN algorithm

Initialization stands for the generation of randomized solutions for a D dimensional problem within a certain range defined by \(N_{min,j}\) lower and \(N_{max,j}\) upper bounds of the problem. The mathematical formulation is given by

$$\begin{aligned} N_{i,j}=N_{min,j}+r\times (N_{max,j}-N_{min,j}) \end{aligned}$$
(40)

where \(j \in [1,2,..., D]\), \(i \in [1,2,..., n]\), and \(r \in [0,1]\).

Worker phase

The second and most interesting phase of the DHPN algorithm is the worker phase. It is for better expl properties and forms the core of the proposed algorithm. This phase is divided into four sub-phases.

a) Phase I: for iterations \(\le t_{max}/4\). The first phase of the worker group follows the same equations as used in the basic NMRA. These equations are based on two random solutions initialized within the search space. Because the two solutions are random, the algorithm remains highly diverse in nature. This phase is mainly meant for explorative tendencies, and the presence of diverse solutions allows this to be achieved conveniently. The mathematical equation is given by

$$\begin{aligned} x_{i}^{t+1}=x_{i}^{t}+\lambda (x_p^t-x_q^t) \end{aligned}$$
(41)

where \(\lambda\) is a self-adaptive parameter analysed in consecutive subsections.

b) Phase II: for \(t_{max}/4\) < iterations \(\le t_{max}/2\). This section uses the PDO40 algorithm for a dedicated expl operation and is specifically meant for extensive global search. This search operation finds solutions in the close vicinity of, and in particular sections of, the search space. The strategy is modelled using two equations of PDO and is mathematically given by

$$\begin{aligned}{} & {} if \ rand < 0.5 \nonumber \\{} & {} \quad if \ abs(A) \ge 1 \nonumber \\{} & {} \quad \quad o_i^{t+1} = x_{best} - n_{best} \times l - o_i^t \times rand() \nonumber \\{} & {} \quad else \nonumber \\{} & {} \quad \quad o_i^{t+1} = x_{best} \times ge \times rand()\nonumber \\{} & {} elseif \ rand \ge 0.5 \nonumber \\{} & {} \quad de = abs(x_{best} - o_i^t)\nonumber \\{} & {} \quad o_i^{t+1} = de \times exp(l) \times cos(2\pi l) + x_{best}\nonumber \\{} & {} end \nonumber \\{} & {} end \end{aligned}$$
(42)

where \(n_{best}\) is the local best solution, \(o_i^t\) is the current solution, l is the parameter of PDO and is made self-adaptive by using different mutation operators (discussed in consecutive sections), and A is a random number.
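A direct Python transcription of the Phase II update (Eq. 42) is sketched below; sampling A uniformly from [-2, 2] and treating ge as a scalar parameter are assumptions made only for illustration.

```python
import numpy as np

def phase2_update(o_i, x_best, n_best, l, ge=1.0):
    """PDO-inspired worker update of Phase II (Eq. 42)."""
    A = np.random.uniform(-2, 2)            # assumed range of the random number A
    if np.random.rand() < 0.5:
        if abs(A) >= 1:
            return x_best - n_best * l - o_i * np.random.rand()
        return x_best * ge * np.random.rand()
    de = np.abs(x_best - o_i)
    return de * np.exp(l) * np.cos(2 * np.pi * l) + x_best
```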

c) Phase III: for \(t_{max}/2\) < iterations \(\le 3t_{max}/4\). This phase is controlled by DMA and is used for enhancing the expl operation while retaining the advantages of the expt operation. The DHPN algorithm uses a combination of the expl- and expt-inspired scout DM phase for the position update, given by

$$\begin{aligned} o_i^{t+1}= {\left\{ \begin{array}{ll} o_i - CF \times rand() \times [x_{best} -o_i] \hspace{5pt} if \ rand()> 0.5 \hspace{5pt} \textit{expl} \\ o_i + CF \times rand() \times [x_{best} -o_i] \hspace{5pt} if \ rand() \le 0.5 \hspace{5pt} \textit{expt} \end{array}\right. } \end{aligned}$$
(43)

where \(rand \in [0,1]\), and CF is made self-adaptive in nature using different mutation operators. More details on the added parameters are presented in consecutive subsections.
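For completeness, the Phase III move of Eq. (43) reads as the following short sketch.

```python
import numpy as np

def phase3_update(o_i, x_best, CF):
    """DMA-inspired worker update of Phase III (Eq. 43)."""
    step = CF * np.random.rand() * (x_best - o_i)
    return o_i - step if np.random.rand() > 0.5 else o_i + step   # expl / expt branches
```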

d) Phase IV: for iterations > \(3t_{max}/4\). The final phase of iterations is controlled by using HBA. Here, the digging phase and honey phase of HBA are used to formulate the basic equations of this phase. The mathematical formulation is given by

$$\begin{aligned}{} & {} if \ r < 0.5 \nonumber \\{} & {} o_i^{t+1} = o_i^t + g \times \beta \times I \times o_i^t + g \times rand() \times \alpha \times d_i \times (cos(2 \pi g)) \nonumber \\{} & {} else \nonumber \\{} & {} o_i^{t+1} = o_i^t + F \times \beta \times I \times o_i^t + g \times rand() \times \alpha \times d_i \times (cos(2 \pi g)) \times |[1-cos(2 \pi g)]| \nonumber \\{} & {} end \end{aligned}$$
(44)

where \(d_i = x_{best}- o_i^t\) is the distance to the best solution, \(\beta\) and I are chosen as 1, \(\alpha\) is a random control parameter that decreases over iterations to reduce the diversity, and g is the major parameter of HBA and is optimized using a different set of mutation/iw operators. This parameter helps the search agents change their direction for rigorous expl. Apart from that, the major reason for using the digging and honey phases of HBA is that their basic equations include both expl and expt search.
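The Phase IV move of Eq. (44) can be sketched as follows, with F taken equal to g and \(\beta = I = 1\) as stated above; this reading of the two branches is an assumption.

```python
import numpy as np

def phase4_update(o_i, x_best, g, alpha, beta=1.0, I=1.0):
    """HBA-inspired worker update of Phase IV (Eq. 44), assuming F = g."""
    d_i = x_best - o_i
    common = g * np.random.rand() * alpha * d_i * np.cos(2 * np.pi * g)
    base = o_i + g * beta * I * o_i
    if np.random.rand() < 0.5:
        return base + common
    return base + common * np.abs(1 - np.cos(2 * np.pi * g))
```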

Breeder phase

This phase is meant to provide better expt operation and is evaluated using the current best solutions. Apart from that, \(\lambda\) controls the extent of expt in the breeder phase. The major reason why this phase is kept as such is because of the inherent properties of a limited number of breeders that remain concentrated around the search space, and hence corresponds to better potential solutions around those sections. In a general scenario, the search agents look for potential solutions that are close to the current/previous best. This helps to exploit the search space efficiently and, hence, improves search capabilities.

Selection operation

This phase is meant for finding the best solution. Here, the previous and current solutions are compared based on fitness, and the better of the two is retained.

$$\begin{aligned} o_{i}^{t+1} = {\left\{ \begin{array}{ll} o_{i}^{t+1} &{}\quad \text{ if } f(o_{i}^{t+1}) < f(o_{i}^{t}) \\ o_{i}^{t} &{}\quad \text{ otherwise } \end{array}\right. } \end{aligned}$$
(45)

where \(f(o_i^t)\) is the fitness of the previous solution and \(f(o_i^{t+1})\) signifies the fitness of the current solution.
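For a minimization problem, the greedy selection of Eq. (45) amounts to the following one-liner.

```python
def greedy_select(old_sol, new_sol, fitness):
    """Greedy selection (Eq. 45): keep whichever solution has the lower fitness."""
    return new_sol if fitness(new_sol) < fitness(old_sol) else old_sol
```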

Stagnation phase

To deal with the problems of stagnation, a new phase is incorporated into the proposed DHPN algorithm. This phase uses a combination of SACS56 inspired equations for better performance. This phase is activated only if the solution quality is not improving. This helps the algorithm to improve and produce good solutions. Its general equation is given by,

$$\begin{aligned}{} & {} o_1=o_i-G_1(Y_1.x_{new}-o_i^t);\hspace{5pt} o_2=o_i-G_2(Y_2.x_{new}-o_i^t);\hspace{5pt} o_3=o_i-G_3(Y_3.x_{new}-o_i^t) \end{aligned}$$
(46)
$$\begin{aligned}{} & {} x_{i}^{t+1}=\frac{o_1+o_2+o_3}{3} \end{aligned}$$
(47)

where \(G_1, G_2, G_3 \in A\) and \(Y_1,Y_2,Y_3 \in C\) are given by \(A=2l.r_1-l;\hspace{5pt} C=2.r_2\). The Eq. (47) is adapted using Cauchy \(Cauchy(\delta )\) distributed random parameter and the new equation is

$$\begin{aligned} o_i^{t+1}=o_i^t+\alpha \otimes {Cauchy(\delta )}(n_{best}-o_i^t) \end{aligned}$$
(48)

The equation for Cauchy distribution is given by

$$\begin{aligned} f_{Cauchy(0,g)}(\delta )=\frac{1}{\pi } \frac{g}{\Big ({g^2}+{\delta ^2}\Big )} \end{aligned}$$
(49)

The Cauchy distribution function is

$$\begin{aligned} y=\frac{1}{2}+\frac{1}{\pi }arctan\left( \frac{\delta }{g}\right) \end{aligned}$$
(50)

And \(Cauchy(\delta )\) operator is expressed as

$$\begin{aligned} \delta =tan\left( \pi \Big (y-\frac{1}{2}\Big )\right) \end{aligned}$$
(51)

Here, the Cauchy operator \(\delta\) is added because of its fatter tail, which helps the algorithm provide better expl of the search space. This helps in avoiding local optima and premature convergence. A significant design question is when to activate this phase. The question is still under consideration, and in the current work we activate it if the solution does not change for 10 consecutive iterations.
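A sketch of the stagnation move (Eqs. 46-48) is given below. How Eq. (48) is chained with Eq. (47) is not fully specified above; applying the Cauchy-perturbed move to the averaged solution of Eq. (47) is one plausible reading and is an assumption of this sketch, as is reusing the adaptive parameter l in place of the GWO parameter a.

```python
import numpy as np

def cauchy(size, g=1.0):
    """Cauchy(0, g) variates via the inverse CDF of Eqs. (50)-(51)."""
    y = np.random.rand(size)
    return g * np.tan(np.pi * (y - 0.5))

def stagnation_step(o_i, x_new, n_best, l, alpha=1.0):
    """Stagnation-phase update (Eqs. 46-48), a GWO/CS-inspired restart move."""
    parts = []
    for _ in range(3):
        G = 2 * l * np.random.rand(o_i.size) - l       # A = 2l.r1 - l
        Y = 2 * np.random.rand(o_i.size)               # C = 2.r2
        parts.append(o_i - G * (Y * x_new - o_i))      # Eq. (46)
    x_avg = sum(parts) / 3.0                           # Eq. (47)
    # Cauchy-perturbed move towards the local best, Eq. (48)
    return x_avg + alpha * cauchy(o_i.size) * (n_best - x_avg)
```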

Parameter settings

The proposed DHPN uses a combination of six new mutation operators/inertia weights (iw) for analysing the performance of its six parameters (\(\lambda\), CF, P, R, g and l) and making them adaptive in nature. The mutation operators introduced include simulated annealing (sa), chaotic, exponential decreasing (exp), linearly decreasing (lin), oscillating, and random iws. Note that random weights are added only to see how the algorithm behaves when no adaptive parameter is added. The iws are formally given as:

Simulated annealing (sa) iw

This iw is meant for providing better expl and expt38 and is mathematically given by

$$\begin{aligned} \zeta _k= \zeta _{min}+(\zeta _{max}-\zeta _{min})\times p^{(k-1)} \end{aligned}$$
(52)

where \(\zeta _{min}\), \(\zeta _{max} \in [0, 1]\) and \(p = 0.95\).

Chaotic iw

This iw is meant for improving the global search41, and is done by guiding the algorithm away from the local optima.

$$\begin{aligned}{} & {} K=4 \times k \times (1-k) \end{aligned}$$
(53)
$$\begin{aligned}{} & {} \zeta = (\zeta _{max}-\zeta _{min}) \times \frac{t_{max}-t}{t_{max}} + \zeta _{min} \times K \end{aligned}$$
(54)

where \(k \in [0, 1]\), \(\zeta _{max} = 0.9\), \(\zeta _{min} = 0.5\), and \(t_{max}\) is the maximum number of iterations.

Exponential decreasing (exp) iw

Exponential iw helps in slowly moving the algorithm from expl towards expt. This helps to achieve better convergence patterns and hence has better expl towards the start and expt towards the end42. It is mathematically given as:

$$\begin{aligned} \zeta (t)=\zeta _{min}+(\zeta _{max}-\zeta _{min})exp\left[ -\frac{t}{(\frac{t_{max}}{10})}\right] \end{aligned}$$
(55)

where \(\zeta _{min}\) and \(\zeta _{max}\) lies between [0,1].

Linearly decreasing (linear) iw

This iw follows a linear pattern for the adaptation of parameters and helps in providing a transitional change from expl towards expt38. In general, this iw improves the search for global solutions with better convergence speed. It is given by:

$$\begin{aligned} lin = \Big (1- \frac{t}{t_{max}}\Big )^{\Big (2 \times \frac{t}{t_{max}}\Big )} \end{aligned}$$
(56)

Oscillating iw

This iw generates periodic waves for balanced expl and expt, and is mathematically modelled as:

$$\begin{aligned}{} & {} \zeta (t)=\frac{\zeta _{min}+\zeta _{max}}{2}+\frac{\zeta _{max}-\zeta _{min}}{2}cos\Bigg (\frac{2\pi \,t}{T}\Bigg ) \end{aligned}$$
(57)
$$\begin{aligned}{} & {} T=\frac{2 \times t_{max}}{3+2k} \end{aligned}$$
(58)

where \(\zeta _{max} = 0.9\) and \(\zeta _{min} = 0.3\), and \(k \in [0, 1]\).
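The six schedules of Eqs. (52)-(58) can be collected into a single helper, as sketched below. The default bounds w_min and w_max are illustrative; the per-operator values quoted above (e.g. 0.9/0.5 for the chaotic iw and 0.9/0.3 for the oscillating iw) can be passed in instead, and the chaotic variable k would normally be carried across iterations through the logistic map of Eq. (53).

```python
import numpy as np

def inertia_weight(kind, t, t_max, w_min=0.4, w_max=0.9, k=None):
    """Mutation/iw schedules of Eqs. (52)-(58)."""
    if kind == "sa":            # simulated annealing, Eq. (52), with p = 0.95
        return w_min + (w_max - w_min) * 0.95 ** (t - 1)
    if kind == "chaotic":       # chaotic, Eqs. (53)-(54)
        k = np.random.rand() if k is None else k
        K = 4 * k * (1 - k)
        return (w_max - w_min) * (t_max - t) / t_max + w_min * K
    if kind == "exp":           # exponential decreasing, Eq. (55)
        return w_min + (w_max - w_min) * np.exp(-t / (t_max / 10))
    if kind == "linear":        # linearly decreasing, Eq. (56)
        return (1 - t / t_max) ** (2 * t / t_max)
    if kind == "oscillating":   # oscillating, Eqs. (57)-(58)
        k = np.random.rand() if k is None else k
        T = 2 * t_max / (3 + 2 * k)
        return (w_min + w_max) / 2 + (w_max - w_min) / 2 * np.cos(2 * np.pi * t / T)
    return np.random.rand()     # random iw
```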

The flowchart of the proposed algorithm is given in Fig. 1.

Figure 1
figure 1

Flowchart of DHPN algorithm.

The pseudocode of the proposed algorithm is given in Algorithm 1.

Algorithm 1
figure a

Pseudocode of DHPN algorithm

Computational complexity

In this section, the computational complexity of the proposed algorithm is analysed with respect to that of the basic algorithm NMRA. If n is the size of the population, d signifies the dimensionality of the problem and \(t_{max}\) represents the maximum number of iterations to find the global optimum, then the computational complexity of the basic NMRA is expressed as O(\(n.d.t_{max}\))37. The complexity analysis is done to examine the operation of an algorithm with worst-case complexities and find the run-time of an algorithm. In the case of a fixed problem dimension, the complexity stands at O(d) for an individual population member. However, when the algorithm uses multiple search agents, the complexity increases to O(n.d), accounting for the population’s size. Given the stochastic nature of the algorithm, evaluation of \(t_{max}\) iterations results in an overall complexity of O(\(n.d.t_{max}\)).

In contrast to the original NMRA, the exploration operation in the proposed DHPN algorithm is divided across multiple iteration windows, but the total number of iterations is kept fixed at \(t_{max}\). Thus, there is no change in complexity due to this added adaptation. The added stagnation phase contributes a complexity of O(n.d), since it is activated only if the algorithm gets stuck for a certain number of iterations, and the new solutions are generated only once per activation. Hence, the overall computational complexity of the proposed DHPN algorithm is the same as that of the basic NMRA.

Results and discussion

This section presents the analysis of DHPN algorithm for different benchmark suites to confirm its efficiency over other algorithms. The section has eight parts, the first subsection consists of CEC 2005 benchmarks. The second and third subsections give the parametric details and analysis of the algorithm under test, respectively. The sensitivity analysis of pop size and dimension size is analysed in the fourth and fifth parts, respectively. The sixth and seventh subsections show the comparative analysis of CEC 2005 and CEC 2019 benchmarks. Finally, the convergence profiles are discussed in the eighth subsection.

The experimental study was performed using a 64-bit Windows 10 operating system, Intel(R) Core(TM) i5-9300H CPU @ 2.40GHz processor, 8.00 GB RAM. Its source code was implemented using MATLAB (R2022b).

Test suite

It gives details of the CEC 2005 benchmarks used to analyse the efficiency of the DHPN algorithm. Table 2 gives the description of the test functions. These functions can be broadly categorized as unimodal (UM) problems, multimodal (MM) problems, and fixed dimension (FD) problems48. The UM problems (G1–G7) are simple functions with a single global minimum and are used to analyse expt, whereas MM problems \((G8-G12)\) are used to test the expl capability. Such functions have multiple local minima. On the other hand, as the name suggests, the dimension size is fixed for the FD functions \((G13-G15)\). These functions analyse the consistency in finding a global solution.

Table 2 CEC 2005 Benchmark Dataset.

Preliminary parameter settings

This section describes the parameters used for comparative analysis with the proposed DHPN algorithm. For comparison with CEC 2005 benchmarks, success-history based DE (SHADE)63, DE with external archive (JADE)50, self-adaptive DE (SaDE)64, opposition and exponential WOA (OEWOA)65, fractional-order calculus-based FPA (FA-FPO)66, sine cosine crow search (SCCSA)67, evolution strategy with covariance adaptation (CMA-ES)68, extended GWO (GWO-E)69, LSHADE-SPACMA52, dwarf mongoose optimization (DMO) algorithm39, HBA36 and cuckoo search (CS) algorithm70 are used. Secondly, for CEC 2019 benchmark problems, DHPN is compared with DE, self-adaptive jDE10071, particle swarm optimization (PSO)72, Young's double-slit experiment inspired optimizer (YDSE)73, naked mole-rat algorithm (NMRA)37 and prairie dog optimization (PDO)40 algorithm. The parameter details of the aforementioned algorithms are based on their corresponding papers and are also provided in Table 3.

Table 3 Parameter settings of different algorithms.

Sensitivity analysis: parametric study of DHPN algorithm

DHPN algorithm has six important parameters namely \(\lambda\), CF, P, \(\textbf{R}\), l, and g. These parameters are subjected to mutation operators for 3 UM functions (G2, G6 & G7), 1 MM function (G11) and 1 FD function (G13) of CEC 2005 benchmarks. Results are evaluated as mean values and standard deviation (std) values. In addition, Friedman rank (f-rank)79 test values are given in Table 4.

The parameter \(\lambda\) is associated with three mutations: the simulated annealing (sa) inertia weight (iw), chaotic, and exponential decreasing (exp) iws. Table 4 clearly shows that for \(\lambda\), the sa mutation operator performs the best for G2 and G13, whereas the exp mutation operator performs better for G6 and G7. For G11, the parameter \(\lambda\) yields the best performance with the chaotic iw in comparison to the other operators. Overall, the parameter \(\lambda\) with the chaotic mutation operator is the best strategy.

CF is checked for three different cases: linear, chaotic iw and exp iw. Results given in Table 4 depict that the chaotic iw yields the best solutions for G2 and G6 for the parameter CF. For G7, G11, and G13, the exp iw performs better compared to the other operators for the parameter CF. Overall, the parameter CF with exp iw is found to be the best.

P is analysed for a constant iw, chaotic iw and exp iw. Table 4 shows that the exp iw outperforms the other operators for G6, G11 and G13. For G2, the parameter P with a constant value yields better results. For G7, the parameter P with chaotic iw gives the best results. Overall, the parameter P with exp iw outperforms the other operators, and the f-rank statistical test validates the results.

\(\textbf{R}\) is analysed for random iw, chaotic iw and exp iw. Table 4 shows that for the parameter \(\textbf{R}\), chaotic iw performs the best for G2 and G6, whereas exp iw is best for G7 and G13. For G11, the random nature of the parameter \(\textbf{R}\) outperforms the other operators. Overall, the parameter \(\textbf{R}\) with random iw is the best.

The last parameters, l and g, are analysed using the linearly decreasing (linear) iw, oscillating iw and exp iw. Table 4 shows that the parameter l with exp iw yields the best results for all problems. Hence, exp iw is the best strategy for the parameter l. Further, the parameter g gives the best results with a linear reduction between [2, 0]. So, here the linear iw of g is the best strategy among all the operators.

Table 4 Parameter analysis of DHPN algorithm

Effect of population size

For the pop size comparison, DHPN is compared with MFO, GWO, WOA, MPA and NMRA for four pop sizes, namely 25, 50, 75 and 100. The total numbers of runs and generations are set to 51 and 500, respectively. Here, 7 UM functions (G1–G7) and 5 MM functions (G8–G12) from Table 2 are used for the analysis of the various algorithms for different pop sizes, as given in Table 5.

Population size 25: Here, for G3, G4, G7 and G9, DHPN algorithm yields the superior performance for both mean and std values. For G1 and G2, WOA performs the best, however, DHPN algorithm is quite competitive too. For G5, there is a slight variation in the mean values of different algorithms. NMRA is the best for G5. For functions G6, G11 and G12, MPA gives the best performance. For G8, both MPA and DHPN are capable of yielding the optimum value. For G10, the DHPN algorithm, along with MPA and NMRA, is best. Thus, the performance of the DHPN algorithm is best for six problems, MPA is superior for five, and WOA and NMRA for two test problems each. So, DHPN algorithm is found to be the most superior of all the other algorithms for 25 pop size.

Population size 50: Here, for G1, G2, G3, G4, G7 and G9, DHPN algorithm outperforms all the other existing algorithms under comparison. For G5, NMRA is better. For G6, G11 and G12, MPA is found better. However, for G8 and G10, MPA, NMRA, and DHPN are capable of giving the optimum values. Hence, the DHPN algorithm gives the best performance for seven, MPA is better for five, and NMRA gives the best result for three. Overall, DHPN is superior for pop size 50.

Population size 75: Here DHPN algorithm is better for G1, G3, G7 and G9. WOA is found to be the best for G2, but the performance of DHPN algorithm is also competitive. NMRA gives the best results for G4. For G5, NMRA is better among all. For G6, G11 and G12, MPA outperforms others. For G8, both MPA and DHPN algorithm yield the optimum value. For G10, GWO, MPA, NMRA and DHPN algorithms are better. Hence, the DHPN algorithm is best for six test problems, MPA is best for five, NMRA is best for three, and both WOA and GWO give the best performance for one function only.

Population size 100: Here, DHPN outperforms other algorithms for G1, G3, G7 and G9. For G2, WOA yields better results, but DHPN results are still competitive. For G4 and G5, NMRA outperforms the other algorithms. For G6, G11 and G12, MPA gives the best results. For G8, both MPA and DHPN outperform the other algorithms under comparison. All the algorithms except WOA give the exact optimum value for the G10. Therefore, DHPN is superior for six, MPA for five, and NMRA for three, whereas WOA, MFO and GWO perform best for one function each. Therefore, DHPN is best for pop size 100 too.

Inferences: Table 5 clearly depicts that the performance of the DHPN algorithm decreases for smaller pop sizes. Further, larger pop sizes do not contribute significantly, but add to the computational burden. This is because the number of function evaluations required to solve any problem is a multiple of the population size. With increasing pop size, the required function evaluations also increase, resulting in higher overall computational complexity. Here, it can be seen that a population of 50 individuals is capable of providing good results without increasing the computational burden of DHPN. Therefore, for the simulation results, the pop size is set to 50.

Table 5 Experimental results for population size of 25, 50, 75, 100.

Effect of dimension size

This subsection describes the effect of dimension size on the DHPN algorithm in comparison with MFO, GWO, MPA, WOA, and NMRA. Here, 7 UM functions (G1–G7) and 5 MM functions (G8–G12) from Table 2 are used for the analysis. Total runs and iterations are set to 51 and 500, respectively. To study the effect of dimension size, four dimension sizes, namely 10, 30, 50 and 100, are chosen. The results obtained for the different dimension sizes are given in Table 6.

Dimension size 10: Here, DHPN algorithm is better for G1, G4, G7, G2, G3, G8 and G9. For G5, NMRA gives superior results in terms of std values. For G6, G11, and G12, MPA outperforms others. GWO, MPA, NMRA and the proposed DHPN algorithm are capable of giving the exact global minima for G10. Overall, the proposed DHPN algorithm gives the best solutions for eight, MPA for five, NMRA for two, and GWO for one function only.

Dimension size 30: Here, the DHPN algorithm outperforms the others for G1, G2, G3, G4, G7 and G9. For G5, NMRA performs better; for G6, G11 and G12, MPA yields the better solutions. MPA, NMRA and the proposed DHPN algorithm give the exact global minima for G8 and G10. Overall, DHPN performs better for eight, MPA for five and NMRA for two test problems.

Dimension size 50: Here, DHPN is the best for G1, G2, G3, G4, G7 and G9. Further, for G5, NMRA gives the best std values. MPA gives better performance for G6, G11 and G12. For G8 and G10, MPA, NMRA, and the proposed DHPN algorithm give improved performance. Here, the DHPN algorithm is better for eight, MPA for five, and NMRA for two functions.

Dimension size 100: Here, the proposed DHPN algorithm yields the best results for G1, G2, G3, G4, G6, G7 and G9 for both mean as well as std values. For G5, NMRA is found to be the best. For G8, WOA, NMRA, MPA, and DHPN are capable of yielding the optimum value. For G10, all the algorithms except MFO and WOA give the desired global optimum value. For G11 and G12, WOA and the proposed DHPN algorithm give better results, respectively. Hence, the proposed DHPN is better for ten problems, NMRA for three, WOA and MPA for two, and GWO for one function only. In a nutshell, the proposed DHPN is best for dimension size 100.

Inference: Simulation results clearly indicate that the DHPN algorithm is the best overall strategy for all the dimension sizes. For the rest of the simulation results, the value of dimension size is set to 30.

Table 6 Experimental results for dimension size of 10, 30, 50, 100.

Comparison for CEC 2005 benchmarks

Here, the DHPN algorithm is evaluated against existing optimization algorithms such as JADE, OEWOA, FA-FPO, SCCSA, GWO-E, SaDE, HBA, LSHADE-SPACMA, CMA-ES, SHADE, DMO, and CS for the CEC 2005 benchmark functions48. The parameters of the aforementioned algorithms are shown in Table 3. Here, a dimension size of 30 is used and the number of runs is set to 51.

Experimental studies

Comparative results in Table 7 clearly depict that for G1, G2, G3, G4, G7 and G9, only the proposed DHPN algorithm yields better solutions, while the other algorithms may result in sub-optimal solutions. For G5, G11 and G12, JADE is better, and for G6, CMA-ES outperforms the others. For G8, the GWO-E, OEWOA, FA-FPO, HBA and DHPN algorithms are capable of providing the optimum solution. Further, for G10, the GWO-E, FA-FPO, DMO, HBA and DHPN algorithms yield the optimal solution. For G13, only small differences in the mean values can be seen, and on comparing the std values, DMO gives the best results. Similarly, for G14, a slight variation in the mean values of the different algorithms is observed, so on the basis of std values, JADE is the best. Further, for G15, OEWOA is found to be the best. Hence, among fifteen functions, the proposed DHPN algorithm is best for eight, JADE for four, OEWOA, GWO-E, FA-FPO, DMO and HBA for two, whereas CMA-ES is best for one function only. So, overall, the proposed DHPN algorithm outperforms the others.

Statistical testing

To validate the simulation results, two non-parametric tests, namely the Wilcoxon’s p-rank80 and Friedman f-rank tests79, are employed. The p-rank test assigns p-values to the two algorithms under comparison and is reported in Table 7 for every problem as win(w)/loss(l)/tie(t). Here, win(w) = '+' denotes the situation in which the test algorithm yields better performance than the proposed algorithm; loss(l) = '−' arises when the performance of the test algorithm is inferior to the proposed algorithm; and the tie(t) scenario, marked '=', arises when the difference between the two algorithms is not statistically significant. From the w/l/t counts in Table 7, we find that the DHPN algorithm is better than the existing algorithms. The Friedman f-rank test is also applied to validate the superiority of the proposed DHPN algorithm. In this test, each algorithm is assigned a rank, shown in Table 7. It has been found that the DHPN algorithm is comparatively better and attains the \(1^{st}\) rank among all the others.
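As an illustration of how a win/loss/tie label can be assigned for one benchmark function, the following sketch assumes the per-run final errors of the two algorithms are available as arrays and uses SciPy's rank-sum test with a 5% significance level; the exact test configuration used in this work may differ.

```python
import numpy as np
from scipy.stats import ranksums

def wlt_label(errors_test, errors_dhpn, alpha=0.05):
    """Assign '+', '-', or '=' for one benchmark function from per-run errors.

    errors_test : final errors of the compared algorithm over all runs
    errors_dhpn : final errors of the proposed DHPN over the same runs
    Returns '+' if the test algorithm is significantly better (lower error),
    '-' if it is significantly worse, and '=' if the difference is not significant.
    """
    stat, p = ranksums(errors_test, errors_dhpn)
    if p >= alpha:                      # no statistically significant difference
        return '='
    return '+' if np.median(errors_test) < np.median(errors_dhpn) else '-'
```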

Table 7 Simulation results for CEC 2005 benchmark functions.

Comparison on CEC 2019 benchmark functions

As a further extension, the competitiveness of the DHPN algorithm is demonstrated by testing it on ten CEC 2019 benchmarks71 of the 100-Digit Challenge81. Table 8 lists the names, dimensions, search ranges and optimal values of the CEC 2019 benchmark functions \((G16-G25)\). For computing simulation results on the CEC 2019 functions, the pop size, number of runs and maximum iterations are taken as 50, 51 and 500, respectively. Further, the existing algorithms jDE100, DE, PSO, YDSE, NMRA and PDO are compared with the proposed DHPN algorithm in terms of mean and std values in Table 9.

Table 8 CEC 2019 Benchmark Functions.

Table 9 shows that the proposed DHPN is best for functions G21, G22 and G25; for G16 and G18, PDO is better; and for G17, YDSE yields the best performance. PSO yields best results for G19, G20, G23 and G24. Here also, Wilcoxon’s p-rank and f-rank tests are done to test the significance of the DHPN algorithm. From Table 9, we find that the DHPN algorithm has 10 wins and no loss against jDE100. As compared to both DE and PDO, DHPN has 8 wins and 2 losses with no draw. Similarly, DHPN has a total of 6 wins and 4 losses with no draw against PSO and YDSE. Moreover, according to the f-rank test, DHPN stands at \(1^{st}\) among all comparative algorithms. Thus, DHPN is deduced to be the best overall strategy for the CEC 2019 benchmark suite.

Table 9 Statistical results for 100-digit challenge (CEC 2019) numerical problems.

Convergence profiles

The convergence profiles of YDSE, DE, NMRA and the proposed DHPN algorithm are shown for functions G16, G17, G20, G21, G23 and G24 in Fig. 2. Here, convergence curves are drawn for maximum iterations, pop size and dimension as 500, 50 and 30, respectively. The convergence profiles for G16, G17, G21, G23 and G25 clearly confirm the fastest convergence of the DHPN algorithm among the other algorithms under comparison. For G20, the proposed DHPN algorithm begins to converge faster initially, but its rate of convergence decreases later. So, for G20, the convergence of DE is the fastest. Overall, the DHPN algorithm is superior to the existing algorithms in terms of convergence.

Figure 2
figure 2

Convergence profiles of YDSE, DE, NMRA, and DHPN.

Real world application: region-based image fusion using DHPN

The demand for image fusion in image processing applications has grown significantly with the tremendous increase in acquisition systems45,46. Image fusion is a process of integrating images acquired by various sensors to yield an informative image. An effective image fusion preserves vital information by extracting all essential properties and features from the images without introducing any inconsistency in the fused image. This fused image is more apt for both human and machine perception, as it gives information that cannot be obtained by examining the individual images separately. In certain applications, details from CCD and IR sensors are combined with the objective of maintaining their distinct features82. The process of merging information obtained from several sensors is called multi-modal (MM) image fusion83,84. An example of an image pair obtained from different sensors with their fused image is shown in Fig. 3. The final image clearly maintains the distinct features of both the VIS and IR images: human presence is depicted, and edges are preserved.

Figure 3
figure 3

Example of an image pair obtained by different sensors and their fusion (a) VIS image (b) IR image and (c) Fused image.

Over the last three decades, several image fusion techniques have been developed to improve image quality and make them appropriate for different applications such as remote sensing, target detection, surveillance, biological recognition, medical images, military field, computer vision, and robotics85,86,87.

Related work

Image fusion

The image fusion techniques can be categorized into pixel level, decision level and feature level. Pixel-level image fusion techniques integrate the meaningful data from source images at the lowest physical level directly88. Feature-level image fusion techniques involve the extraction of relevant features like textures, edges or pixel intensities, which are then combined to form merged images53,89. Decision-level image fusion is the most complex and highest-level fusion. In this, decisions made from initial object detection and classification tasks are merged90. As compared to feature level and decision level techniques, pixel-level image fusion provides the maximum amount of details and is more suitable for applications that are dependent on different sensors91.

The pixel-level techniques can be classified into substitution techniques and transform domain techniques. Substitution methods include Independent Component Analysis (ICA)92 and Principal Component Analysis (PCA)93, whereas transform domain techniques include the Pyramid Transform (PT)94, Discrete Wavelet Transform (DWT)95, shift-variant shearlet transform96, curvelet54, and contourlet transform97. Although the aforementioned methods have low computational cost, they are prone to noise and are unable to preserve adequate details of the source images98.

Visible and infrared image fusion

IR images are more robust to varying light levels and environmental deterioration. On the contrary, VIS images have finer visual quality with precise background information like texture and higher pixel pitch. The objective of fusing VIS and IR images is to derive significant details from both images and incorporate them in the compound image47. Hence, over the past two decades, VIS and IR image fusion has become an intensive area of research in multi-band nighttime monitoring and navigation imagery.

The most widely used methods for fusing VIS and IR images are region-based techniques99,100. Region-based image fusion techniques involve segmenting the source images into a number of segments or clusters using region-based algorithms and then merging them region by region92,99,101. The most significant stage of such techniques is the extraction of salient details from IR images. The present work employs a novel optimization strategy for region-based image fusion of VIS and IR images.

In the proposed region-based technique, DHPN is used to segment the VIS image to derive its salient features. The extracted features are used to calculate the weight map function of both VIS and IR images. These weight map functions are further used to form the final fused image.

Multi-level thresholding for image segmentation

Image segmentation is a pivotal pre-processing task in image processing. An effective, simple way to achieve image segmentation is through image thresholding, which classifies pixels into different groups through a set of threshold levels55,102. The aim of thresholding is to compute the optimal threshold levels for differentiating the target from its background103. If a method divides an image into only two classes using a single optimum threshold, it is said to be a bi-level or two-level thresholding method. On the other hand, in multi-level thresholding, an image is segmented on the basis of multiple optimum threshold levels.

The widely used thresholding techniques are mainly categorized as parametric and non-parametric techniques. Parametric techniques are computationally expensive in terms of both cost and time, as these methods require the estimation of Probability Density Functions (PDFs). On the other hand, non-parametric techniques yield results with better accuracy and less computational cost104. Such techniques calculate threshold values by measurement of entropy105, error rate, between-class variance106, local maxima, and so on. The most preferred method for thresholding-based image segmentation is the OTSU method106.

Image segmentation using OTSU method

Also known as the between-class variance method, the OTSU method is an adaptive binarization thresholding algorithm that was proposed by the Japanese scholar Otsu in 1979. This non-parametric, unsupervised method utilizes the between-class variance for separating the segmented classes. It obtains a threshold value that maximizes the inter-class (between-class) variance, resulting in minimum intra-class variance between the pixel intensities of each class. Thus, this method divides a given image into background and foreground depending on its gray-scale characteristics, such that the difference between them is largest at the best threshold.

The fundamental principle of image segmentation using OTSU method is described below.

Consider an image whose gray levels span the range \(k=0,1,2,...,K-1\). If \(m_i\) signifies the number of pixels with the ith gray level, then the total number of pixels M in the image is

$$\begin{aligned} M=\sum _{i=0}^{K-1}m_i=m_0+m_1+m_2+...+m_{K-1}. \end{aligned}$$
(59)

The likelihood of occurrence of ith gray level is given by

$$\begin{aligned} p_i=\frac{m_i}{M}=\frac{m_i}{\sum _{i=0}^{K-1}m_i}. \end{aligned}$$
(60)

Assuming T is the gray level threshold which divides the image into two classes, \(Y_0=\{0,1,...,T\}\) and \(Y_1=\{T+1,T+2,...,K-1\}\), the probabilities of \(Y_0\) and \(Y_1\) are

$$\begin{aligned} \Omega _0=\sum _{i=0}^{T}p_i \end{aligned}$$
(61)

and

$$\begin{aligned} \Omega _1=\sum _{i=T+1}^{K-1}p_i, \end{aligned}$$
(62)

The means and variances of both \(Y_0\) and \(Y_1\) are

$$\begin{aligned}{} & {} \mu _0=\sum _{i=0}^{T}\frac{ip_i}{\Omega _0}, \end{aligned}$$
(63)
$$\begin{aligned}{} & {} \mu _1=\sum _{i=T+1}^{K-1}\frac{ip_i}{\Omega _1}, \end{aligned}$$
(64)
$$\begin{aligned}{} & {} \sigma _0^2=\sum _{i=0}^{T}(i-\mu _0)^2p_i/\Omega _0 \end{aligned}$$
(65)

and

$$\begin{aligned} \sigma _1^2=\sum _{i=T+1}^{K-1}(i-\mu _1)^2p_i/\Omega _1, \end{aligned}$$
(66)

If \(\mu _T\) denotes mean intensity of the image, then

$$\begin{aligned} \Omega _0 \mu _0+\Omega _1 \mu _1=\mu _T \end{aligned}$$
(67)

and

$$\begin{aligned} \Omega _0+\Omega _1=1. \end{aligned}$$
(68)

The between-class variance and within-class (intra-class) variance are defined as

$$\begin{aligned} \sigma _B^2=\Omega _0(\mu _0-\mu _T)^2+\Omega _1(\mu _1-\mu _T)^2 \end{aligned}$$
(69)

and

$$\begin{aligned} \sigma _I^2=\Omega _0\sigma _0^2+\Omega _1\sigma _1^2. \end{aligned}$$
(70)

Hence, the total variance is

$$\begin{aligned} \sigma _T^2=\sigma _I^2+\sigma _B^2 \end{aligned}$$
(71)

To perform thresholding, the OTSU method maximizes the between-class variance, which equivalently minimizes the within-class variance, since the total variance is constant. This method can be extended to multi-level thresholding by assuming \(N-1\) thresholds \(\{T_1,T_2,...,T_{N-1}\}\) that segment the total pixels into N different classes \(\{Y_0,Y_1,...,Y_{N-1}\}\). The thresholds can be calculated by

$$\begin{aligned} \begin{aligned} \{T_1,T_2,...,T_{N-1}\}&=\arg \{\max _{0\le T\le K-1}\{ \sigma _B^2 (T_1,T_2,...,T_{N-1})\}\}\\&=\arg \{\min _{0\le T\le K-1}\{ \sigma _I^2 (T_1,T_2,...,T_{N-1})\}\} \end{aligned} \end{aligned}$$
(72)

where

$$\begin{aligned}{} & {} \Omega _j=\sum _{i=T_{j-1}+1}^{T_j}p_i, \end{aligned}$$
(73)
$$\begin{aligned}{} & {} \mu _j=\sum _{i=T_{j-1}+1}^{T_j}ip_i/\Omega _j, \end{aligned}$$
(74)
$$\begin{aligned}{} & {} \sigma _j^2=\sum _{i=T_{j-1}+1}^{T_j}(i-\mu _j)^2p_i/\Omega _j, \end{aligned}$$
(75)
$$\begin{aligned}{} & {} \sigma _B^2=\sum _{j=1}^N\Omega _j(\mu _j-\mu _T)^2 \end{aligned}$$
(76)

and

$$\begin{aligned} \sigma _I^2=\sum _{j=1}^N\Omega _j\sigma _j^2. \end{aligned}$$
(77)
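As an illustration, the between-class variance of Eq. (76) can serve directly as the fitness function that DHPN maximizes when searching for the threshold vector. A minimal Python sketch, assuming a normalized gray-level histogram as input, is given below.

```python
import numpy as np

def between_class_variance(hist, thresholds):
    """Between-class variance (Eq. 76) for a candidate threshold vector.

    hist:       normalized gray-level histogram p_i, i = 0..K-1
    thresholds: sorted integer thresholds T_1 < ... < T_{N-1}
    Returns sigma_B^2; an optimizer such as DHPN would maximize this value
    (equivalently, minimize -sigma_B^2).
    """
    levels = np.arange(len(hist))
    mu_T = np.sum(levels * hist)                    # global mean intensity
    # class boundaries: [0, T_1], (T_1, T_2], ..., (T_{N-1}, K-1]
    bounds = [0] + [t + 1 for t in thresholds] + [len(hist)]
    sigma_b = 0.0
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        omega_j = np.sum(hist[lo:hi])               # class probability (Eq. 73)
        if omega_j == 0:
            continue
        mu_j = np.sum(levels[lo:hi] * hist[lo:hi]) / omega_j   # class mean (Eq. 74)
        sigma_b += omega_j * (mu_j - mu_T) ** 2     # accumulate Eq. (76)
    return sigma_b
```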

Weight map computation and fusion strategy

The proposed DHPN algorithm calculates the threshold values of an input image to obtain the segmented image along with its weight map. Assuming the input image is an IR image, the corresponding segmented image and its weight map are given by

$$\begin{aligned} I_{SI}(a,b)=\text {DHPN}(I_{IR}(a,b)) \end{aligned}$$
(78)

and

$$\begin{aligned} W_{IR}^1(a,b)=I_{SI}(a,b), \end{aligned}$$
(79)

The weight map of the VIS image corresponding to the input IR image is

$$\begin{aligned} W_{VIS}^1(a,b)=\max ( W_{IR}^1(a,b))- W_{IR}^1(a,b). \end{aligned}$$
(80)

The weight maps \(W_{IR}^1\) and \(W_{VIS}^1\) are hard and noisy. Thus, they are unsuitable for the direct fusion of the input images. In particular, IR images contain coarse-scale structural data, while VIS images depict fine-scale structures; fusing them directly may carry insignificant details from the IR image together with fine-scale details from the VIS image into the result. To remove these artefacts from the final fused image, the WLS107 optimization scheme is used to refine both weight maps.

WLS preserves edges by striking a balance between sharpening and blurring. Hence, it is an edge-preserving and smoothing filter that progressively sharpens an image whilst preserving its spatial information107. For the weight maps \(W_{IR}^1\) and \(W_{VIS}^1\), the corresponding refined weights \(W_{IR}^2\) and \(W_{VIS}^2\) can be found by minimizing the functions \(F_{IR}\) and \(F_{VIS}\), respectively.

$$\begin{aligned}{} & {} F_{IR}=\sum _p \left( \left( W_{(IR)p}^2- W_{(IR)p}^1\right) ^2+\lambda \left( W_{a,p}W_{IR}^1\left( \frac{\partial W_{IR}^2}{\partial a}\right) _p^2+W_{b,p}W_{IR}^1\left( \frac{\partial W_{IR}^2}{\partial b}\right) _p^2\right) \right) \end{aligned}$$
(81)
$$\begin{aligned}{} & {} F_{VIS}=\sum _p \left( \left( W_{(VIS)p}^2- W_{(VIS)p}^1\right) ^2+\lambda \left( W_{a,p}W_{VIS}^1\left( \frac{\partial W_{VIS}^2}{\partial a}\right) _p^2+W_{b,p}W_{VIS}^1\left( \frac{\partial W_{VIS}^2}{\partial b}\right) _p^2\right) \right) \end{aligned}$$
(82)
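A sketch of such a WLS refinement is given below. It follows the standard Farbman et al.107 formulation, solving a sparse linear system whose smoothness weights are derived from the gradients of the guiding source image; the parameter values (lam, alpha, eps) are illustrative assumptions rather than the exact settings used in this work.

```python
import numpy as np
from scipy.sparse import coo_matrix, identity
from scipy.sparse.linalg import spsolve

def wls_refine(weight_map, guide, lam=1.0, alpha=1.2, eps=1e-4):
    """Edge-preserving WLS refinement of a weight map (cf. Eqs. 81-82).

    weight_map : initial weight map W^1 (2-D float array)
    guide      : source image guiding the smoothness weights
    Solves (I + lam * L_g) w2 = w1, where L_g is a spatially varying Laplacian.
    """
    h, w = guide.shape
    n = h * w
    g = np.log1p(guide.astype(np.float64))
    idx = np.arange(n).reshape(h, w)

    rows, cols, vals = [], [], []
    # horizontal neighbour weights a_x = 1 / (|dg/dx|^alpha + eps)
    ax = 1.0 / (np.abs(np.diff(g, axis=1)) ** alpha + eps)
    rows.append(idx[:, :-1].ravel())
    cols.append(idx[:, 1:].ravel())
    vals.append(-ax.ravel())
    # vertical neighbour weights a_y = 1 / (|dg/dy|^alpha + eps)
    ay = 1.0 / (np.abs(np.diff(g, axis=0)) ** alpha + eps)
    rows.append(idx[:-1, :].ravel())
    cols.append(idx[1:, :].ravel())
    vals.append(-ay.ravel())

    off = coo_matrix((np.concatenate(vals), (np.concatenate(rows), np.concatenate(cols))),
                     shape=(n, n))
    off = off + off.T                                  # symmetric negative off-diagonals
    diag_vals = -np.asarray(off.sum(axis=1)).ravel()   # positive degree terms on the diagonal
    lap = off + coo_matrix((diag_vals, (np.arange(n), np.arange(n))), shape=(n, n))
    A = identity(n, format="csr") + lam * lap.tocsr()
    return spsolve(A, weight_map.astype(np.float64).ravel()).reshape(h, w)
```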

After obtaining refined weight maps \(W_{IR}^2\) and \(W_{VIS}^2\), the pixel-wise single-scale weighted average composition is carried out to get the fused image, \(I_{FI}\) as

$$\begin{aligned} I_{FI}(a,b)=W_{IR}^2(a,b)\times I_{IR}(a,b)+W_{VIS}^2(a,b)\times I_{VIS}(a,b) \end{aligned}$$
(83)
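Putting the pieces together, the overall fusion pipeline of Eqs. (78)-(83) can be sketched as follows. The helper `dhpn_segment` is a hypothetical stand-in for DHPN-based multi-level thresholding (returning the segmented level image of Eq. 78), `wls_refine` may be the WLS sketch shown earlier, and the per-pixel normalization of the refined weights is an added assumption to keep the weighted average of Eq. (83) well scaled.

```python
import numpy as np

def fuse_ir_vis(ir, vis, dhpn_segment, wls_refine, n_thresholds=4):
    """Region-based IR/VIS fusion sketch following Eqs. (78)-(83).

    ir, vis      : grayscale source images as 2-D float arrays
    dhpn_segment : hypothetical helper performing DHPN multi-level thresholding
    wls_refine   : edge-preserving WLS refinement (see the sketch above)
    """
    # Eqs. (78)-(79): segment the IR image; the segmented image is W_IR^1
    w_ir = dhpn_segment(ir, n_thresholds).astype(np.float64)
    # Eq. (80): complementary weight map for the VIS image
    w_vis = w_ir.max() - w_ir
    # Eqs. (81)-(82): refine both hard, noisy weight maps
    w_ir = wls_refine(w_ir, guide=ir)
    w_vis = wls_refine(w_vis, guide=vis)
    # normalize so the two weights sum to one at every pixel (assumption)
    total = w_ir + w_vis + 1e-12
    w_ir, w_vis = w_ir / total, w_vis / total
    # Eq. (83): pixel-wise weighted average composition
    return w_ir * ir + w_vis * vis
```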

Experimental results

This section demonstrates the effectiveness of the DHPN algorithm in the context of region-based image fusion. The DHPN algorithm is analysed against the existing image fusion algorithms based on the discrete cosine harmonic wavelet (DCH)108, cross bilateral filter (CBF)109, gradient transfer method (GTF)110, saliency detection in sparse domain (JSR)111, convolutional sparse representation (CSR)112 and deep convolutional neural network (CNN)113. For the simulations, twelve pairs of visible and infrared images are chosen from the TNO data set114, which is available online at https://doi.org/10.6084/m9.figshare.1008029.v1. Further, a comparative analysis has been done on the basis of four fusion metrics, namely the edge-based similarity index (\(Q^{AB/F}\)), the sum of correlation difference (SCD), the structural similarity index measure (SSIM) and the artefact measure (\(N^{AB/F}\)), which are discussed in the subsequent subsections. Their MATLAB codes are available in the public domain.

Edge based similarity index (\(Q^{AB/F}\))

This index represents the edge preservation detail in the fused images, with values in the range [0, 1]. It can be calculated as115,116

$$\begin{aligned}{} & {} Q^{AB/F}=\frac{\sum _{u=1}^U\sum _{v=1}^V Q^{A/F}(u,v)q_A(u,v)+Q^{B/F}(u,v)q_B(u,v)}{\sum _{u=1}^U\sum _{v=1}^V q_A(u,v)+q_B(u,v)}, \end{aligned}$$
(84)
$$\begin{aligned}{} & {} Q^{A/F}(u,v)=Q_q^{A/F}(u,v) Q_o^{A/F}(u,v), \end{aligned}$$
(85)
$$\begin{aligned}{} & {} Q^{B/F}(u,v)=Q_q^{B/F}(u,v) Q_o^{B/F}(u,v), \end{aligned}$$
(86)

where F represents the fused image of source images A and B; \(q_A(u,v)\) and \(q_B(u,v)\) denote the weights for images A and B at pixel position (u, v), respectively. The parameters \(Q_o^{A/F}\) and \(Q_o^{B/F}\) represent the preservation of orientation values, whereas the parameters \(Q_q^{A/F}\) and \(Q_q^{B/F}\) denote the edge strength in images A and B at pixel location (u, v), respectively.

The \(Q^{AB/F}\) values of the different algorithms are given in Table 10. Ideally, the value of \(Q^{AB/F}\) should be one, indicating that all the edge details are preserved. It is clearly evident that the average \(Q^{AB/F}\) value obtained by the proposed DHPN algorithm over the twelve pairs is 0.765508, which is higher than that of the other algorithms. This signifies that the proposed algorithm preserves the edge details in the fused images quite significantly with respect to the existing algorithms.

Table 10 Comparison of various algorithms on the basis of \(Q^{AB/F}\).

Sum of correlation difference (SCD)

SCD denotes the extent of useful information that is transmitted to the fused images from its corresponding source images99,117. For good fusion results, a higher value of SCD is desirable. It is computed as

$$\begin{aligned} SCD=corr(diff_1,A)+corr(diff_2,B), \end{aligned}$$
(87)

where \(diff_1=F-B\), \(diff_2=F-A\) and corr(.) represents correlation function.
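A minimal sketch of Eq. (87) in Python is shown below; `corr` here is the standard zero-mean (Pearson) correlation, which is an assumption about the exact correlation function used in the cited works.

```python
import numpy as np

def scd(fused, a, b):
    """Sum of correlation differences, Eq. (87); higher values indicate better fusion."""
    def corr(x, y):
        x = x - x.mean()
        y = y - y.mean()
        return float(np.sum(x * y) / (np.sqrt(np.sum(x ** 2) * np.sum(y ** 2)) + 1e-12))
    diff1 = fused - b   # details assumed to originate from source A
    diff2 = fused - a   # details assumed to originate from source B
    return corr(diff1, a) + corr(diff2, b)
```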

The SCD values for various algorithms are given in Table 11. The results show the superiority of the DHPN algorithm over other algorithms for five image sets in terms of SCD values. On the contrary, CSR is best for four image sets, JSR is best for two image sets, and DCH yields a higher value of SCD for one image set only. Overall, the proposed DHPN algorithm gives a higher average value of SCD, which is desirable. This shows that with DHPN, more meaningful data is transmitted into the final fused image than with the existing algorithms.

Table 11 Comparison of various algorithms on the basis of SCD.

Structural similarity index measure (SSIM)

As the name suggests, SSIM represents the extent of similarity between two images, with values in the range [0, 1]. It is preferable when the ground truth is available. In the case of image fusion, SSIM is calculated as

$$\begin{aligned} SSIM(A,F)=\frac{(2\alpha _A\alpha _F+C_1)(2\sigma _{A/F}+C_2)}{(\alpha _A^2+\alpha _F^2+C_1)(\sigma _A^2+\sigma _F^2+C_2)}, \end{aligned}$$
(88)
$$\begin{aligned} SSIM(B,F)=\frac{(2\alpha _B\alpha _F+C_1)(2\sigma _{B/F}+C_2)}{(\alpha _B^2+\alpha _F^2+C_1)(\sigma _B^2+\sigma _F^2+C_2)} \end{aligned}$$
(89)

and

$$\begin{aligned} SSIM= \frac{SSIM(A,F)+SSIM(B,F)}{2}. \end{aligned}$$
(90)

Here, \(\alpha _A\), \(\alpha _B\) and \(\alpha _F\) denote mean intensities, whereas \(\sigma _A^2\), \(\sigma _B^2\) and \(\sigma _F^2\) represent variance of images A, B, and F, respectively. The parameters \(\sigma _{A/F}\) and \(\sigma _{B/F}\) signify covariance of source images A and B, and fused image F, respectively. \(C_1\) and \(C_2\) denote constants.
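The following sketch evaluates Eqs. (88)-(90) globally from the image means, variances and covariances. The stabilizing constants \(C_1\) and \(C_2\) are assumed here to take the customary values \((0.01\times 255)^2\) and \((0.03\times 255)^2\); a windowed SSIM, as used in most implementations, averages the same expression over local patches.

```python
import numpy as np

def fusion_ssim(a, b, fused, c1=(0.01 * 255) ** 2, c2=(0.03 * 255) ** 2):
    """Fusion SSIM, Eqs. (88)-(90), computed from global image statistics."""
    def ssim_pair(x, f):
        mx, mf = x.mean(), f.mean()           # mean intensities (alpha_X, alpha_F)
        vx, vf = x.var(), f.var()             # variances (sigma_X^2, sigma_F^2)
        cov = ((x - mx) * (f - mf)).mean()    # covariance (sigma_{X/F})
        return ((2 * mx * mf + c1) * (2 * cov + c2)) / ((mx ** 2 + mf ** 2 + c1) * (vx + vf + c2))
    # Eq. (90): average of SSIM(A, F) and SSIM(B, F)
    return 0.5 * (ssim_pair(a, fused) + ssim_pair(b, fused))
```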

Table 12 gives the SSIM values for the twelve pairs of images using the various algorithms. The higher the SSIM value, the greater the similarity of the constituent images with the fused image. It is quite evident from the results that, out of twelve image sets, CSR performs best for three, whereas DHPN outperforms the other existing image fusion algorithms for nine. Overall, DHPN gives the highest average SSIM value of 0.726317 among all the existing algorithms. Hence, DHPN performs superiorly in terms of average SSIM value.

Table 12 Comparison of various algorithms on the basis of SSIM.

Artefact measure (\(N^{AB/F}\))

The process of image fusion may add some meaningless visual details to the final fused image. These artefacts are undesirable and could lead to misinformation which can severely affect fusion applications. The measure \(N^{AB/F}\)118,119 gives the extent of such distortion or noise introduced in the final merged image, and hence its lower value is preferable.

The values of \(N^{AB/F}\) for different image fusion methods are given in Table 13. It is evident that the value of \(N^{AB/F}\) is minimum for eleven image sets in the case of the proposed DHPN algorithm, whereas the CSR algorithm yields the lowest value for only one image set, ”Camp”. Overall, DHPN gives a minimum average \(N^{AB/F}\) value, i.e. 0.006617, as compared to the existing algorithms. This signifies that DHPN introduces the least amount of artefacts in the output image of all the other image fusion algorithms.

Table 13 Comparison of various algorithms on the basis of \(N^{AB/F}\).

Furthermore, fused images of various image sets using the DHPN algorithm are given in Fig. 4. In image sets ”Camp”, ”Traffic”, ”Building”, ”Home post”, ”Bench”, ”Doorway”, ”Soldier” and ”Trench”, a human figure is visible only in the corresponding IR images, whereas background details like buildings, foliage, trees can be seen in the corresponding VIS images. DHPN combines both of these features in the corresponding fused images, thus, increasing the information content. For image sets ”Bunker”, ”Heather”, ”Light hut” and ”Lake”, the objective is to insert spectral details present in the IR images into the corresponding VIS images for night-vision context enhancement. It can be seen that the corresponding fused images show better context enhancement and have details from both the VIS and IR images. In a nutshell, DHPN preserves background details from VIS images and object details from IR images in the corresponding final fused images. Thus, DHPN is capable of transferring necessary details from VIS as well as IR images into the fused images. This is also confirmed by the comparative analysis based on different fusion quality metrics.

Figure 4
figure 4figure 4figure 4

Image fusion results using proposed DHPN algorithm.

Discussion

This section has three subsections. The first subsection summarizes the major findings of the proposed algorithm and its application to the benchmark functions and image fusion. The second subsection provides the drawbacks of the proposed approach, and the final subsection presents some insightful implications.

Summary of results

  • The proposed DHPN consists of a balanced expl phase by incorporating DMO, HBA, and PDO into NMRA. A stagnation phase is incorporated for local optima avoidance, and finally, six new iws (simulated annealing, exponential, linear, chaotic, oscillating, and random) are added to make it self-adaptive.

  • In order to evaluate the performance, a pop size and dimension size comparison is done. To find the best set of parameters, a parameter analysis is also presented. It has been found that if the pop size is reduced, the algorithm's diversity reduces and the results deteriorate. However, when the pop size is increased, the variation in results is very limited. So a pop size of 50 can be considered the best. The same holds for dimension size: the results degrade only marginally at higher dimensions, which makes the algorithm a good fit for higher-dimensional problems as well.

  • The six iw operators have been applied to all the parameters of DHPN to make it adaptive in nature. These parameter adaptations help the algorithm in self-tuning, with no adaptation effort required from the user.

  • All of these added advantages are tested over CEC 2005 as well as CEC 2019 benchmarks. It has been found that DHPN provides good optimal results in comparison to other algorithms in the literature.

  • For CEC 2005, the proposed algorithm scores the first rank, HBA has the second rank, and JADE ranks third. For CEC 2019, DHPN was also ranked first, and YDSE was the second-best algorithm.

  • Further, DHPN is applied to the real-world problem of region-based image fusion of VIS and IR images. The average values \(Q^{AB/F} = 0.765508\), \(SCD = 1.63185\), \(SSIM = 0.726317\), and \(N^{AB/F} = 0.006617\) show that DHPN obtains the best combination of results with respect to other algorithms such as DCH, CBF, GTF and JSR. The experimental tests show that the proposed algorithm provides reliable results.

Drawbacks

  • A critical issue with the algorithm is the use of a highly intensive expl operation combined with the expt operation in the worker phase. This makes the algorithm more of an exploitative one and hence increases the chances of poor expl.

  • The stochastic nature of the algorithm means that it can get stuck in local optima, and there is no guarantee of reaching the global solution. This limitation applies to most stochastic optimization algorithms, including the proposed DHPN. While DHPN may perform well for certain classes of problems, it may not be suitable for others where finding the global optimum is crucial. The characteristics of the problem at hand should therefore be considered carefully before applying DHPN. Additionally, strategies such as hybridization or the incorporation of problem-specific knowledge may help mitigate this limitation and improve the algorithm's performance across a wider range of problems.

  • A balance between expl and expt is also not guaranteed in the proposed algorithm. This is because the expl operation is enhanced to a greater extent than the expt operation. Thus, an extensive experimental study is required on how this balance can be achieved effectively.

Insightful implications

  • The proposed DHPN is easier to implement and can be applied to hybrid and expert intelligent systems. We can further enhance its expl and expt operations and can make it the best fit for real-world problems.

  • The algorithm can be changed to a binary version for use in medical imaging, wireless sensor networks, electroencephalogram, clustering and other binary domain problems.

  • DHPN can also be extended to multiple engineering design problems, including multi-objective optimization problems for solving problems with two to many objectives.

  • Apart from these, the algorithm can be explored more to make all its parameters auto-tunable, whereby auto-tunable means that with the progression of the iterations, the algorithm will automatically tune its parameters to fit the problem under consideration.

Conclusion and future work

This paper presents an extensive study of a new multiple hybrid algorithm formulated using four algorithms, namely DMA, HBA, PDO, and NMRA. The algorithm has a new stagnation phase and self-adaptive mutation operators for local optima avoidance and parametric enhancement. The DHPN algorithm is evaluated on the CEC 2005 and CEC 2019 problems. A deeper analysis of the population and dimension size shows that the proposed algorithm fares significantly better for medium pop sizes and can provide good results for higher-dimensional problems. A comparative study with JADE, OEWOA, SaDE, FA-FPO, GWO-E, SHADE, SCCSA, CMA-ES, LSHADE-SPACMA, and jDE100, among others, shows that the proposed DHPN provides highly reliable results. The statistical results of the Wilcoxon and Friedman tests also confirm that DHPN provides better results. Furthermore, DHPN is used as an optimization strategy in region-based image fusion. The efficacy and efficiency of DHPN in image fusion are proven with respect to four fusion quality metrics.

However, the proposed algorithm has an extensive expl operation but suffers from a poor expt operation, and a balance between the two is desirable. In future work, this balance between expl and expt should be pursued. For parametric adaptations, various other mutation/inertia weight operators, including trigonometric, Gaussian, Cauchy and others, can be explored. New mathematical equations can be added to further improve the expl and expt operations. A detailed theoretical and numerical study can be added to see how the algorithm behaves theoretically. Further, as an extension, DHPN can be applied to gene expression modelling, cancer classification, feature selection, and clustering problems, among others.

Some other future works include the exploration of hybrid approaches, particularly integrating DHPN with other optimization techniques and domain-specific algorithms. Advanced versions of DHPN can be designed for handling multi-objective and many-objective optimization problems, adapting to dynamic environments, and scaling for large-scale and parallel optimization. Additionally, the intersection of DHPN with quantum computing is also expected to lead to breakthroughs in evolutionary quantum algorithms, revolutionizing optimization capabilities. Overall, the future of DHPN is characterized by continued innovation, driving its broader adoption and impact across diverse domains.