Assume that \({I_1}\) and \({I_2}\) denote the source images. First, NSCT is used to decompose \({I_1}\) and \({I_2}\) into sub-bands. The primary objective is to fuse the corresponding sub-bands of the two source images. The high sub-bands are fused using the extreme version of the Inception (Xception) model, and the coefficient of determination is utilized to evaluate the significance of the computed fused high sub-bands. The low sub-bands are fused using a local energy function. Finally, the inverse NSCT is applied to obtain the multi-modality fused image. Figure 1 shows the step-by-step methodology of the proposed model.
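The overall flow described above can be sketched as follows. All callable names here (`nsct_decompose`, `fuse_high`, etc.) are hypothetical stand-ins for the components detailed in the following subsections, not the authors' implementation:

```python
import numpy as np

def fuse_images(I1, I2, nsct_decompose, nsct_reconstruct,
                fuse_high, fuse_low):
    """Illustrative skeleton of the proposed pipeline; every callable
    is a hypothetical stand-in for a component of the method."""
    # 1. Decompose each source image into low and high sub-bands.
    low1, high1 = nsct_decompose(I1)
    low2, high2 = nsct_decompose(I2)
    # 2. Fuse high sub-bands (Xception features + R) and
    #    low sub-bands (local energy rule).
    high_f = fuse_high(high1, high2)
    low_f = fuse_low(low1, low2)
    # 3. The inverse transform yields the fused image.
    return nsct_reconstruct(low_f, high_f)
```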
Nonsubsampled contourlet transform
The Nonsubsampled Contourlet Transform (NSCT) is a well-known transform used to decompose images into the wavelet domain. It is shift-invariant and provides rich directional details. This directionality makes it possible to reconstruct the original image from the transformed coefficients with minimal root-mean-square error (for more details, please see Da Cunha et al. 2006).
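NSCT itself is not available in mainstream Python packages. As an illustrative stand-in only, the sketch below performs a shift-invariant, à-trous-style low/high split with a simple box filter; it is not actual NSCT, but it shares the nonsubsampled (no downsampling, hence shift-invariant) character, and the sub-bands sum back to the original image:

```python
import numpy as np

def box_filter(img):
    """3x3 box filter with edge padding (no downsampling)."""
    m, n = img.shape
    p = np.pad(img, 1, mode='edge')
    return sum(p[i:i + m, j:j + n]
               for i in range(3) for j in range(3)) / 9.0

def atrous_split(img, levels=2):
    """Shift-invariant low/high split in the spirit of a nonsubsampled
    filter bank (illustrative stand-in, not actual NSCT)."""
    low = np.asarray(img, dtype=float)
    highs = []
    for _ in range(levels):
        smoothed = box_filter(low)
        highs.append(low - smoothed)  # detail (high) sub-band
        low = smoothed                # approximation (low) sub-band
    return low, highs
```

Because nothing is downsampled, perfect reconstruction is just `low + sum(highs)`.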
Feature extraction using deep Xception model
A CNN may suffer from under-fitting, as many potentially useful features may not be extracted. To overcome this issue, the extreme version of the Inception (Xception) model is used. Figure 2 represents the block diagram of the Xception model (for mathematical and other details, please see Chollet 2017).
The high sub-bands of both source images are fed in parallel into the Xception model. Let \(\eta {I_1}(p,q)\) and \(\eta {I_2}(p,q)\) denote the features obtained from the respective high sub-bands using the Xception model.
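Xception's key ingredient is the depthwise separable convolution. The following is a minimal NumPy sketch of that single building block (illustrative only, not the full pretrained model), splitting a standard convolution into a per-channel spatial filter followed by a 1×1 channel-mixing step:

```python
import numpy as np

def depthwise_separable_conv(x, dw_kernels, pw_weights):
    """Sketch of Xception's core building block.
    x: (H, W, C_in) feature map
    dw_kernels: (3, 3, C_in), one spatial kernel per input channel
    pw_weights: (C_in, C_out), pointwise 1x1 channel mixing."""
    H, W, C = x.shape
    p = np.pad(x, ((1, 1), (1, 1), (0, 0)), mode='constant')
    dw = np.zeros_like(x, dtype=float)
    for c in range(C):                 # depthwise: per-channel filtering
        for i in range(3):
            for j in range(3):
                dw[:, :, c] += dw_kernels[i, j, c] * p[i:i + H, j:j + W, c]
    return dw @ pw_weights             # pointwise: mix channels per pixel
```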
Feature selection using multi-objective differential evolution
In this step, the optimal features are selected from those obtained with the Xception model. The fusion factor and entropy metrics serve as the fitness function for selecting the optimal features. Multi-objective differential evolution can solve many computationally complex problems (Babu et al. 2005) and effectively balances fast convergence against population diversity. It proceeds in the following steps:
I. Initialization: First, the parameters of differential evolution are defined, such as the population size (\(t_p\)), crossover rate (\(c_r\)), and mutation rate (\(m_r\)). A uniform random distribution is used to generate the initial solutions \(\beta _\alpha ^0\; (\alpha = 1, 2, \ldots ,t_p)\). h denotes the number of function evaluations performed so far; it controls the iterative process up to the maximum number of function evaluations (\({h}_{M}\)).
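Step I can be sketched as follows (uniform initialization within user-supplied bounds; the `seed` argument is added here only for reproducibility and is not part of the described method):

```python
import numpy as np

def initialize_population(t_p, D, lower, upper, seed=0):
    """Uniformly random initial population of t_p solutions in D
    dimensions (parameter names follow the text: t_p, c_r, m_r)."""
    rng = np.random.default_rng(seed)
    return lower + rng.random((t_p, D)) * (upper - lower)
```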
II. Iterative step: Mutation and crossover operators are applied to obtain the optimal set of features.
Mutation is applied to a parent vector \(\beta _\alpha ^{h}\) to produce a mutant vector \(\Pi _\alpha ^{h}\). In this paper, the following mutation operator is used:
$$\begin{aligned} \Pi _\alpha ^{h} =\beta ^{h} _{{d1}}+m_r \cdot (\beta ^{h}_{d2}-\beta ^{h}_{d3}) \end{aligned}$$
(1)
Here, \(\alpha \) is the population index, and \({d}_1\), \({d}_2\), and \({d}_3\) are mutually distinct random indices drawn from \([1,\;t_p]\) with \({d}_i \ne \alpha \;\forall i=1:3\).
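Eq. (1) can be sketched as follows (the helper name `de_mutation` is hypothetical; `pop` holds the current population as rows):

```python
import numpy as np

def de_mutation(pop, alpha, m_r, rng):
    """DE/rand/1 mutation of Eq. (1):
    Pi = beta_d1 + m_r * (beta_d2 - beta_d3),
    with d1, d2, d3 mutually distinct and different from alpha."""
    t_p = len(pop)
    candidates = [i for i in range(t_p) if i != alpha]
    d1, d2, d3 = rng.choice(candidates, size=3, replace=False)
    return pop[d1] + m_r * (pop[d2] - pop[d3])
```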
Crossover is used to obtain new solutions. A child \(\epsilon _\alpha ^{h}\) is obtained from every parent \(\beta _\alpha ^{h}\) as:
$$\begin{aligned} \epsilon _{\alpha _\kappa }^{h} ={\left\{ \begin{array}{ll} \Pi _{\alpha _\kappa }^{h}, \quad \beta _{\kappa } \le c_r \quad \text {or}\quad \kappa =\kappa _{{d_n}} \\ \beta _{\alpha _\kappa }^{h}, \quad \text {otherwise} \end{array}\right. } \kappa =1,2,\dots ,{D}, \end{aligned}$$
(2)
where D denotes the dimensionality of the problem, \(\beta _{\kappa } \in [0,\; 1]\) is a uniform random number drawn per component, and \(\kappa _{{d_n}} \in [1,\; {D}]\) is a randomly chosen index that guarantees at least one component is taken from the mutant vector.
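The binomial crossover of Eq. (2) can be sketched as (helper name `de_crossover` is hypothetical):

```python
import numpy as np

def de_crossover(parent, mutant, c_r, rng):
    """Binomial crossover of Eq. (2): take the mutant component when a
    uniform draw is <= c_r, or at the one forced index kappa_dn."""
    D = parent.shape[0]
    mask = rng.random(D) <= c_r
    mask[rng.integers(D)] = True   # kappa_dn: at least one mutant gene
    return np.where(mask, mutant, parent)
```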
III. Selection: The child vector \(\epsilon _\alpha ^{h}\) competes with its parent vector \(\beta _\alpha ^{h}\), and the better of the two survives:
$$\begin{aligned} \beta _\alpha ^{{h}+1}={\left\{ \begin{array}{ll} \epsilon _\alpha ^{h}, \quad f(\epsilon _\alpha ^{h}) \le f (\beta _\alpha ^{h}) \\ \beta _\alpha ^{h}, \quad \text {otherwise} \end{array}\right. } \end{aligned}$$
(3)
IV. Stopping condition: Steps II and III are repeated as long as the number of function evaluations is less than the maximum allowed (\({h}_{M}\)).
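Steps I–IV can be combined into one compact loop. The sketch below minimizes a generic fitness `f`, whereas the paper's fitness combines fusion factor and entropy (not reproduced here); the `seed` argument is added only for reproducibility:

```python
import numpy as np

def differential_evolution(f, t_p, D, lower, upper,
                           m_r=0.5, c_r=0.9, h_M=2000, seed=0):
    """Compact sketch of Steps I-IV (minimization of f)."""
    rng = np.random.default_rng(seed)
    pop = lower + rng.random((t_p, D)) * (upper - lower)    # Step I
    fit = np.array([f(x) for x in pop])
    h = t_p                                                 # evaluations so far
    while h < h_M:                                          # Step IV
        for a in range(t_p):                                # Step II
            if h >= h_M:
                break
            idx = [i for i in range(t_p) if i != a]
            d1, d2, d3 = rng.choice(idx, size=3, replace=False)
            mutant = pop[d1] + m_r * (pop[d2] - pop[d3])    # Eq. (1)
            mask = rng.random(D) <= c_r                     # Eq. (2)
            mask[rng.integers(D)] = True
            child = np.where(mask, mutant, pop[a])
            fc = f(child)
            h += 1
            if fc <= fit[a]:                                # Eq. (3), Step III
                pop[a], fit[a] = child, fc
    best = int(np.argmin(fit))
    return pop[best], float(fit[best])
```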
Fusion of high sub-bands
The features extracted and selected from the high sub-bands using the Xception model are then fused using the coefficient of determination (R). R between \(\eta {I_1}(p,q)\) and \(\eta {I_2}(p,q)\) is computed as:
$$\begin{aligned}&R_{N}(\eta {I_1}, \eta {I_2}) \nonumber \\&\quad = \frac{\Big (\sum _{p=1}^{m} \sum _{q=1}^{n} (\eta {I_1}(p,q)-\overline{\eta {I_1}})(\eta {I_2}(p,q)- \overline{\eta {I_2}})\Big )^2}{{\sum _{p=1}^{m}\sum _{q=1}^{n} (\eta {I_1}(p,q)-\overline{\eta {I_1}})^2}\; \times \; {\sum _{p=1}^{m}\sum _{q=1}^{n}(\eta {I_2}(p,q)-\overline{\eta {I_2}})^2}} \end{aligned}$$
(4)
Here, \(\overline{\eta {I_1}}\) and \(\overline{\eta {I_2}}\) denote the averages of the respective high sub-bands.
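Eq. (4) can be computed directly in NumPy. In this sketch, `F1` and `F2` stand for the feature maps \(\eta I_1\) and \(\eta I_2\):

```python
import numpy as np

def coefficient_of_determination(F1, F2):
    """R of Eq. (4): squared normalized cross-covariance of the two
    feature maps; the result lies in [0, 1]."""
    a = F1 - F1.mean()
    b = F2 - F2.mean()
    return (a * b).sum() ** 2 / ((a ** 2).sum() * (b ** 2).sum())
```

Identical (or exactly anti-correlated) maps give R = 1, while uncorrelated maps give R near 0.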
The dominant features are preserved in the obtained feature maps as:
$$\begin{aligned} F_{s}(p,q) = max(s{I_1} \times R_{N} + s{I_2} \times (1-R_{N})) \end{aligned}$$
(5)
Here, \(s{I_1}\) and \(s{I_2}\) show high sub-bands of \({I_1}\) and \({I_2}\), respectively.
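Reading Eq. (5) as an element-wise weighted combination of the two high sub-bands (with R from Eq. (4) as the weight), a minimal sketch is:

```python
import numpy as np

def fuse_high(s1, s2, R):
    """Weighted fusion of Eq. (5): the sub-band with the larger
    weight dominates the fused map; R is a scalar in [0, 1]."""
    return s1 * R + s2 * (1.0 - R)
```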
Fusion of low sub-bands
Motivated by Hermessi et al. (2018), local energy is used to fuse the low sub-bands as:
$$\begin{aligned} \chi _I(p,q) = \sum _{p' \in \gamma } \sum _{q' \in \delta }|I(p+p', q+q')| \end{aligned}$$
(6)
Here, \(I = {I_1}\) or \({I_2}\), and \(\gamma \; \times \; \delta \) represents the neighborhood of the patch centered at (p, q). The size of the local patch is set to \(5 \times 5\). The fused coefficients of the low sub-bands are computed as:
$$\begin{aligned} \psi _f(p,q) = {\left\{ \begin{array}{ll} \psi {I_1}(p,q)\quad |\chi {I_1}(p,q)|\ge |\chi {I_2}(p,q)|\\ \psi {I_2}(p,q) \quad |\chi {I_1}(p,q)|< |\chi {I_2}(p,q)|\\ \end{array}\right. } \end{aligned}$$
(7)
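Eqs. (6) and (7) can be sketched as follows. Edge padding at the image borders is an assumption here, since the paper does not specify border handling:

```python
import numpy as np

def local_energy(low, size=5):
    """Eq. (6): per-pixel sum of absolute coefficients over a
    size x size neighborhood (borders handled by edge padding)."""
    r = size // 2
    m, n = low.shape
    p = np.pad(np.abs(low), r, mode='edge')
    return sum(p[i:i + m, j:j + n]
               for i in range(size) for j in range(size))

def fuse_low(low1, low2, size=5):
    """Eq. (7): keep, per pixel, the coefficient whose local
    energy is larger."""
    e1, e2 = local_energy(low1, size), local_energy(low2, size)
    return np.where(e1 >= e2, low1, low2)
```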