Abstract
Multimodal medical image fusion is an effective means of addressing a range of clinical problems, such as clinical diagnosis and postoperative treatment. In this study, a medical image fusion method based on convolutional sparse representation (CSR) and mutual information correlation is proposed. In this method, the source image is decomposed into one high-frequency and one low-frequency subband by the non-subsampled shearlet transform (NSST). For the high-frequency subband, CSR is used for coefficient fusion. For the low-frequency subband, different fusion strategies are applied to different regions according to mutual information correlation analysis. Analysis of two kinds of medical image fusion problems, namely, CT–MRI and MRI–SPECT, reveals that the performance of this method is robust in terms of five common objective metrics. Compared with six other advanced medical image fusion methods, the experimental results show that the proposed method achieves better results in both subjective vision and objective evaluation metrics.
Introduction
In recent years, medical imaging has become an indispensable means in clinical diagnosis, surgery, and radiotherapy. However, single-modality medical images only capture a certain type of morphological feature. For example, computed tomography (CT) images reflect structural information about bone but are insensitive to soft tissues of similar density. Although magnetic resonance imaging (MRI) has a strong ability to display soft tissues, it is poor at showing bone lesions and calcified lesions. Image fusion can therefore combine complementary information from medical images of different modalities and help clinicians perform postoperative detection and monitor tumor and bone growth [1, 2].
Among the many medical image fusion methods, those based on multiscale transforms have attracted particular attention, because they adopt a multiresolution processing mechanism similar to that of the human visual system. These methods include pyramid-based decomposition [3], wavelet-based decomposition [4], and multiscale geometric analysis [5]. They share two common problems: first, the decomposition level is difficult to determine; second, the fusion strategy is difficult to choose.
The decomposition level is a key problem to be solved. When the decomposition level is low, not enough spatial detail can be extracted from the source image, whereas when the decomposition level is high, the fusion of high-frequency subbands becomes more sensitive to noise and misregistration [6,7,8]. A simple solution is to decompose an image into only one high-frequency subband and one low-frequency subband. The high-frequency subband contains most of the detail and edge information, whereas the low-frequency subband contains the contour and structure information of the image. In recent years, multiscale decomposition methods based on NSCT and NSST have become popular because they are multiscale, multidirectional, and shift invariant. In particular, NSST has attracted more attention because of its computational efficiency, which is superior to that of NSCT. Compared with pyramid-based methods, such as Gaussian pyramid decomposition, Laplacian pyramid decomposition [9], and gradient pyramid transformation [10], NSST-based methods decompose along multiple directions and thus capture more image details. Compared with wavelet methods, such as the discrete wavelet and dual-tree complex wavelet transforms, NSST-based methods represent the curves and edge details of an image well. Compared with multiscale geometric transforms, such as the contourlet transform (COT) [11] and shearlet transform (ST) [12], NSST-based methods do not produce the pseudo-Gibbs phenomenon caused by frequency aliasing. However, most existing NSST decomposition methods use a high decomposition level, which not only increases the amount of computation but also makes the high-frequency subbands susceptible to noise. To preserve the structural information of the image as much as possible and to extract additional salient details, a new multiscale decomposition method is proposed in this study.
Unlike general multiscale decompositions, which depend on scale parameters, this method uses NSST to decompose the image at only one level, yielding one high-frequency subband and one low-frequency subband. In addition to using convolutional sparse representation to enhance the detail information of the high-frequency subband, correlation analysis is used to extract the detail information of the low-frequency subband, which remains rich at such a shallow decomposition. Sparse representation seeks to represent image features with as few sparse vectors as possible and is widely used in image reconstruction and denoising. The improvement offered by convolutional sparse representation is that the sparse coefficients of local image patches are replaced by global sparse coefficients.
The fusion strategy is important for the quality of the fused image. In multiscale decomposition, a common strategy is to measure the activity of the decomposition coefficients first and then fuse them according to the mean or maximum value. For example, in [13, 14], both the high- and low-frequency subbands adopt the maximum scheme for fusion. However, the low-frequency subband provides structure information similar to the source image, whereas the high-frequency subbands contain important details; thus, a single fusion scheme cannot account for the similarity and the importance of the image simultaneously. In [15], a weighted average fusion strategy is adopted for similar regions of the images, with the weights computed by a Siamese network. However, how this method defines similar regions directly affects the final fusion result. Recently, principal component analysis (PCA) [16], sparse representation [17, 18], the smallest univalue segment assimilating nucleus (SUSAN) [19], and pulse-coupled neural networks (PCNN) [20, 21] have been used to enhance the salient information of fused images and to measure the activity of decomposition coefficients. However, these methods have their own problems, either in the selection of sparse dictionaries or in the training time. To obtain a better fusion effect, different fusion strategies are selected for different subbands in this study: maximum fusion is used for the high-frequency subband and for the detail regions of the low-frequency subband, and weighted average fusion is used for the similar structural regions of the low-frequency subband.
This study focuses on the choice of decomposition scale and the fusion strategies for the different frequency bands of the NSST decomposition. To avoid the influence of noise and misregistration on high-frequency subband fusion when the NSST decomposition scale is too high, this study carries out only one-level NSST decomposition, producing one high-frequency subband and one low-frequency subband. How to use mutual information correlation analysis to mine the detail information in the low-frequency subband is one objective of this study. As explained above, it is inappropriate for all subbands to adopt the same fusion strategy. Another objective is therefore to determine which fusion strategy should be adopted for the high-frequency subband and for the similar and dissimilar regions of the low-frequency subband.
The main innovations of this study include the following three aspects:

1. The convolutional sparse representation (CSR) model is used to process the high-frequency subband, which enhances the detailed features and reduces both the block effect caused by NSST decomposition and the redundant information across the source images.

2. Mutual information correlation is used to extract the detail information of the low-frequency subband. Given that only a one-level decomposition is conducted, the low-frequency subband contains abundant details, and mutual information correlation analysis can locate the regions containing this detail information.

3. Two different fusion strategies are used for the low-frequency subband. Similar structural information is fused with a weighted average scheme, where the weight is the product of the correlation analysis coefficient and the regional energy sum. A Laplacian energy gradient is used to measure the activity of the dissimilar regions so as to reflect their contrast changes.
The remaining sections of this paper are organized as follows: the next section describes related work on NSST and CSR; the following section explains the proposed method in detail; a comparative experiment is then conducted, and the corresponding results are analyzed; the last section summarizes the study.
Related work
NSST
The non-subsampled contourlet transform (NSCT) and the non-subsampled shearlet transform (NSST) are two popular multiscale geometric decomposition methods, because they are multiscale, multidirectional, and shift invariant. Given that NSST places no restriction on the number of shearing directions and does not need to invert a directional filter bank, its computational efficiency is higher than that of NSCT. NSST consists of two components: non-subsampled pyramid filters (NSPFs) for scale decomposition and shift-invariant shearlet filter banks (SFBs) for directional decomposition.
Figure 1 shows the framework of a two-level NSST decomposition. The input image is decomposed into a high-frequency subband and a low-frequency subband after the first-level scale decomposition by an NSPF, and the low-frequency subband is then decomposed into second-level high-frequency and low-frequency subbands. Therefore, an input image decomposed by L-level NSST is transformed into L high-frequency subbands and one low-frequency subband. At each scale, subbands in multiple directions can be obtained by the SFBs. Moreover, given that traditional subsampled decompositions may introduce frequency overlap, the pseudo-Gibbs phenomenon easily occurs; thus, non-subsampled decomposition is adopted in NSST [22].
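As an illustration of this one-level, shift-invariant split, the sketch below mimics the NSPF stage with a simple box low-pass filter. This is only a stand-in for the real pyramid filters: the directional SFB stage and actual NSST toolboxes (e.g., ShearLab) are omitted.

```python
import numpy as np

def lowpass_split(img, radius=2):
    """Mimic one level of non-subsampled pyramid decomposition:
    a non-subsampled low-pass filter yields the low-frequency
    subband, and the residual is the high-frequency subband.
    A box filter stands in for the real NSPF kernel."""
    pad = np.pad(img, radius, mode="reflect")
    size = 2 * radius + 1
    low = np.zeros_like(img, dtype=float)
    for dy in range(size):
        for dx in range(size):
            low += pad[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    low /= size * size
    high = img - low          # detail (high-frequency) residual
    return low, high

img = np.random.rand(16, 16)
low, high = lowpass_split(img)
# Perfect reconstruction: low + high restores the source exactly,
# mirroring the shift-invariant (non-subsampled) property.
assert np.allclose(low + high, img)
```

Because no subsampling occurs, both subbands keep the full image resolution, which is what makes the later per-pixel fusion rules straightforward.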
Convolutional sparse representation
The idea of SR comes from the way image structures are learned by the receptive fields of simple cells in the visual cortex V1 area. Given its simple representation, SR has been widely used in image denoising [23], feature extraction [24], and super-resolution [25, 26]. Yang et al. [27] and Yin et al. [28] applied SR to image fusion. The main difficulties in SR image fusion are sparse model selection and overcomplete dictionary learning. The common sparse model is based on a single image component and local patches, and its mathematical form is defined as

$$\mathop {\min }\limits_{x} \left\| x \right\|_{0} \quad {\text{s.t.}}\quad \left\| {y - Dx} \right\|_{2} \le \varepsilon$$
Here, \(y \in R^{n}\) is the stacked vector representation of a \(\sqrt n \times \sqrt n\) image patch, \(D \in R^{n \times m}\) is an overcomplete dictionary, and \(x \in R^{m}\) is the sparse coefficient to be solved. However, the disadvantage of this model is that the sparse coefficients are obtained by computing overlapping patches; thus, a global sparse coefficient for the whole image cannot be obtained. To improve fusion performance, Wohlberg [29] proposed the convolutional form of SR. Liu et al. [30] integrated morphological component analysis (MCA) and CSR into a unified optimization framework, which realizes multi-component and global SR of the source images simultaneously. CSR is given by the following equation:

$$\mathop {\min }\limits_{{\left\{ {x_{m} } \right\}}} \frac{1}{2}\left\| {Y - \sum\limits_{m = 1}^{M} {d_{m} * x_{m} } } \right\|_{2}^{2} + \lambda \sum\limits_{m = 1}^{M} {\left\| {x_{m} } \right\|_{1} }$$
Here, \(Y\) is the whole image, which is modeled as the sum of the convolutions of \(M\) local dictionary filters \(d_{m}\) with global coefficient maps \(x_{m}\). The global uniqueness and shift-invariance properties of CSR are conducive to extracting more detailed information and enhancing robustness.
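The synthesis side of this model can be sketched directly: the image is the sum of each dictionary filter applied to its global coefficient map. The filters and coefficient maps below are toy values, not a learned dictionary, and solving the inverse (analysis) problem requires an ADMM solver such as Wohlberg's SPORCO toolbox, which is omitted here. The filtering is written in correlation form; the kernel flip of true convolution is immaterial for this illustration.

```python
import numpy as np

def conv2d_same(x, k):
    """'same'-size 2-D filtering with zero padding (naive loops)."""
    kh, kw = k.shape
    ph, pw = kh // 2, kw // 2
    pad = np.pad(x, ((ph, ph), (pw, pw)))
    out = np.zeros_like(x, dtype=float)
    for i in range(kh):
        for j in range(kw):
            out += k[i, j] * pad[i:i + x.shape[0], j:j + x.shape[1]]
    return out

def csr_reconstruct(filters, coeff_maps):
    """Y ≈ sum_m d_m * x_m: the CSR synthesis model."""
    return sum(conv2d_same(x, d) for d, x in zip(filters, coeff_maps))

# Toy dictionary of M = 2 filters and sparse global coefficient maps.
d = [np.array([[0, 1, 0], [1, 4, 1], [0, 1, 0]], float),
     np.array([[1, 0, -1], [2, 0, -2], [1, 0, -1]], float)]
x = [np.zeros((8, 8)), np.zeros((8, 8))]
x[0][3, 3] = 1.0      # a single active coefficient per map
x[1][5, 2] = -0.5
Y = csr_reconstruct(d, x)
```

Each isolated nonzero coefficient stamps a copy of its filter into the image, which is why a handful of coefficients can describe salient local structure.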
Another difficulty in SR is the learning of overcomplete dictionaries. In SR, the more fully the dictionary represents all the details of the image, the more faithfully the reconstruction can restore the source image. Dictionaries are usually designed in two ways. The first is based on a known transform basis, such as the discrete cosine or wavelet basis; however, as the data and application range change, the performance of such fixed dictionaries degrades considerably. The second is a learning-based approach. K-SVD and its improved dictionary learning methods are widely used in medical image fusion [31, 32]. The adaptive K-SVD dictionary is refined through iterative training, alternating with sparse coding; its disadvantage is the long dictionary training time. In multimodal medical image fusion, the structure of medical images from different sensing devices is more complex, and the data are more redundant. Therefore, dictionary learning based on joint patch clustering is a better choice: by clustering similar patches from all source images, a complete and compact dictionary can be formed.
Proposed framework
As shown in Fig. 2, following the general framework of image fusion in the multiscale transform domain, the proposed method consists of three stages: multiscale decomposition, low-frequency and high-frequency subband fusion, and NSST reconstruction. For simplicity, two source images are used for illustration. First, NSST is applied to the source images \(I_{a}\) and \(I_{b}\). After one-level decomposition, a high-frequency subband with significant details and a low-frequency subband with structure information are obtained. Second, a CSR-based maximum fusion strategy is used for the fusion of the high-frequency coefficients. For the low-frequency subband, mutual information is used for correlation analysis; local energy coefficients are then used for the weighted summation of similar regions, whereas the neighborhood energy gradient with maximum fusion is used for dissimilar regions. Finally, the fused image is reconstructed by the inverse NSST.
Highfrequency subband fusion
Since the high-frequency subband contains the details of the image, high-frequency fusion mainly aims to fuse its salient features. The advantage of convolutional sparse representation is that it can describe these features with few sparse coefficients.
The sparse coefficients of the high-frequency subband of each source image are obtained by Eq. (2). Let \(X_{m}^{k}\) denote the sparse coefficients of the high-frequency subband of the \(k\)-th image, and let \(X_{m,1:N}^{k} \left( {x,y} \right)\) denote the content at location \(\left( {x,y} \right)\); it is an \(N\)-dimensional vector. Here, the \(L_{1}\) norm of \(X_{m,1:N}^{k} \left( {x,y} \right)\) is used to measure the activity level of the source image. Thus, the sparse coefficient fusion rule for the high-frequency subband is defined as

$$X_{m,1:N}^{F} \left( {x,y} \right) = X_{m,1:N}^{{k^{*} }} \left( {x,y} \right),\quad k^{*} = \mathop {\arg \max }\limits_{k} \left\| {X_{m,1:N}^{k} \left( {x,y} \right)} \right\|_{1}$$
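A minimal sketch of this choose-max rule, assuming the coefficient maps of the two source images are stacked into arrays of shape (N, H, W); the toy coefficient values are illustrative only.

```python
import numpy as np

def fuse_csr_coeffs(coeffs_a, coeffs_b):
    """Choose-max fusion of CSR coefficients: at each pixel, keep
    the N-dimensional coefficient vector whose L1 norm (the
    activity level) is larger. Inputs have shape (N, H, W)."""
    act_a = np.abs(coeffs_a).sum(axis=0)   # L1 norm over the N maps
    act_b = np.abs(coeffs_b).sum(axis=0)
    mask = act_a >= act_b                  # True where image A wins
    return np.where(mask[None, :, :], coeffs_a, coeffs_b)

a = np.zeros((2, 4, 4)); a[0, 1, 1] = 3.0
b = np.zeros((2, 4, 4)); b[1, 1, 1] = 1.0; b[0, 2, 2] = 2.0
fused = fuse_csr_coeffs(a, b)
# At (1,1) image A is more active (L1 = 3 vs 1); at (2,2) image B wins.
```

The whole coefficient vector at a pixel is transferred as a unit, so the fused representation stays consistent with the dictionary filters used at reconstruction.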
The fused sparse coefficients are reconstructed by Eq. (4). The fused high-frequency subband is defined as

$$H_{F} = \sum\limits_{m = 1}^{M} {d_{m} * X_{m}^{F} }$$
Lowfrequency subband fusion
Given the low decomposition scale, the low-frequency subband still contains abundant important information. To extract this information further, normalized mutual information (NMI) is used for correlation analysis. For regions with high correlation, a fusion strategy of local-energy-weighted summation is adopted to preserve energy as much as possible. For regions with low correlation, the neighborhood energy gradient (NEG) is adopted to highlight the contrast and edge information of the source images as much as possible.
The premise of pixel-level image fusion is that the input images are linear and complementary. Through correlation analysis of the low-frequency subband, the salient features of the source images can be preserved. Mutual information, a statistical correlation measure based on gray values, is often used in multimodal image registration. The greater the mutual information between two images, the higher their correlation. The mutual information of an image pair can be expressed through the Kullback–Leibler divergence, and its mathematical form is as follows:

$${\text{MI}}\left( {X,Y} \right) = \sum\limits_{x,y} {P\left( {x,y} \right)\log \frac{{P\left( {x,y} \right)}}{{P\left( x \right)P\left( y \right)}}} = H\left( X \right) + H\left( Y \right) - H\left( {X,Y} \right)$$
\(P\left( x \right)\) and \(P\left( y \right)\) represent the probability distributions of the random variables \(X\) and \(Y\), \(P\left( {x,y} \right)\) represents their joint distribution, and \(H\left( {X,Y} \right)\) represents the joint entropy of \(X\) and \(Y\), which reflects their correlation. Here, \(X\) represents the image \(I_{a}\) and \(Y\) represents the image \(I_{b}\). Their joint entropy can be calculated from the joint histogram.
Raw mutual information is unbounded and sensitive to the number of gray-level bins; thus, NMI maps the mutual information into the interval [0, 1] and is defined as

$${\text{NMI}}\left( {X,Y} \right) = \frac{{2\,{\text{MI}}\left( {X,Y} \right)}}{{H\left( X \right) + H\left( Y \right)}}$$
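A sketch of NMI computed from the joint gray-level histogram. The normalization 2·MI/(H(X)+H(Y)) used here is the common choice that maps the score into [0, 1]; the bin count is an illustrative assumption.

```python
import numpy as np

def nmi(x, y, bins=32):
    """Normalized mutual information from the joint gray-level
    histogram: NMI = 2*MI(X;Y) / (H(X) + H(Y)), in [0, 1]."""
    joint, _, _ = np.histogram2d(x.ravel(), y.ravel(), bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1)
    py = pxy.sum(axis=0)
    nz = pxy > 0
    hxy = -(pxy[nz] * np.log2(pxy[nz])).sum()        # joint entropy
    hx = -(px[px > 0] * np.log2(px[px > 0])).sum()   # marginal entropies
    hy = -(py[py > 0] * np.log2(py[py > 0])).sum()
    mi = hx + hy - hxy                               # KL form of MI
    return 2.0 * mi / (hx + hy) if hx + hy > 0 else 1.0

img = np.random.rand(64, 64)
# Identical images are perfectly correlated: NMI = 1.
assert np.isclose(nmi(img, img), 1.0)
```

Applying the same function to 3 × 3 sliding-window blocks of the two low-frequency subbands gives the local correlation map used for region classification.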
After partitioning with a \(3 \times 3\) sliding window, the correlation of each local region of the low-frequency subband can be obtained. Here, the mutual information \(T\) of the whole image is regarded as the correlation threshold. When the NMI of a local region is greater than the threshold \(T\), the image blocks are considered correlated, and a weighted average scheme is used. In medical image fusion, the intensities of different source images at the same location may vary significantly, because the source images are captured with different imaging mechanisms. Therefore, the weight matrix cannot be evaluated with a simple average. Here, we use the center pixel energy, an adaptive weighting method based on region energy, to calculate the weight matrix; its mathematical form is defined by the following equation:

$$E_{m} \left( {x,y} \right) = \sum\limits_{i = - N}^{N} {\sum\limits_{j = - N}^{N} {W_{L} \left( {i,j} \right)L_{m} \left( {x + i,y + j} \right)^{2} } }$$
Here, \(N\) is the radius of the local region \(\left( {2N + 1} \right) \times \left( {2N + 1} \right)\), \(L_{m}\) denotes the low-frequency coefficients of the \(m\)-th image, and \(W_{L}\) indicates the weight of each local pixel. Given that the low-frequency subband is relatively smooth, the weight can be directly set to \(2^{2N - d}\), where \(d\) is the city-block distance from the neighborhood pixel to the center point. If \(N = 1\), the normalized \(W_{L}\) is defined as

$$W_{L} = \frac{1}{16}\left[ {\begin{array}{*{20}c} 1 & 2 & 1 \\ 2 & 4 & 2 \\ 1 & 2 & 1 \\ \end{array} } \right]$$
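Under our reading that \(d\) is the city-block distance to the window center, the weight rule \(2^{2N - d}\) can be generated as follows; for N = 1 it reproduces the normalized [[1, 2, 1], [2, 4, 2], [1, 2, 1]] / 16 mask.

```python
import numpy as np

def center_energy_weights(N=1):
    """Weight mask W_L with entries 2^(2N - d), where d is the
    city-block distance to the window center (our reading of the
    rule above), normalized so the mask sums to 1."""
    size = 2 * N + 1
    offs = np.abs(np.arange(size) - N)
    dist = offs[:, None] + offs[None, :]    # city-block distance grid
    w = 2.0 ** (2 * N - dist)
    return w / w.sum()

W = center_energy_weights(1)
# W == [[1, 2, 1], [2, 4, 2], [1, 2, 1]] / 16
```

The center pixel thus contributes four times as much as a corner pixel to the local energy, which keeps the weight map smooth while emphasizing the pixel being fused.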
For low-frequency regions with high correlation, the coefficient blocks are fused by the strategy of the weighted sum of the center pixel energies and are defined as
For low-frequency regions with low correlation, that is, when the NMI of the local region is less than the threshold \(T\), the coefficient blocks are fused by the strategy of the maximum energy gradient. The NEG is essentially a sum of Laplacian energy, a quantity characterizing image edge features. It reflects the contrast change within a neighborhood window and the edge information of the image block and is defined as
Here, \(\Omega\) indicates a neighborhood window. After the activity is measured by the NEG, the maximum value is taken for the fusion of the low-frequency coefficients, as follows:

$$L_{F} \left( {x,y} \right) = \left\{ {\begin{array}{*{20}l} {L_{a} \left( {x,y} \right),} & {{\text{NEG}}_{a} \left( {x,y} \right) \ge {\text{NEG}}_{b} \left( {x,y} \right)} \\ {L_{b} \left( {x,y} \right),} & {{\text{otherwise}}} \\ \end{array} } \right.$$
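A hedged sketch of this NEG-driven choose-max fusion, using the squared discrete Laplacian summed over a 3 × 3 window as the activity measure — one plausible reading of the sum-of-Laplacian-energy definition. In the full method, this rule applies only in regions whose NMI falls below the threshold \(T\).

```python
import numpy as np

def neg_activity(L, radius=1):
    """Neighborhood energy gradient: the squared discrete Laplacian,
    summed over a (2*radius+1)^2 window Ω (a sum-of-Laplacian-energy
    contrast measure)."""
    pad = np.pad(L, 1, mode="edge")
    lap = (4 * pad[1:-1, 1:-1] - pad[:-2, 1:-1] - pad[2:, 1:-1]
           - pad[1:-1, :-2] - pad[1:-1, 2:]) ** 2
    padE = np.pad(lap, radius, mode="edge")
    size = 2 * radius + 1
    out = np.zeros_like(L, dtype=float)
    for i in range(size):
        for j in range(size):
            out += padE[i:i + L.shape[0], j:j + L.shape[1]]
    return out

def fuse_low_uncorrelated(La, Lb):
    """Choose-max fusion of low-frequency coefficients in
    low-correlation regions, driven by the NEG activity."""
    mask = neg_activity(La) >= neg_activity(Lb)
    return np.where(mask, La, Lb)

flat = np.zeros((8, 8))
edge = np.zeros((8, 8)); edge[:, 4:] = 1.0   # a step edge: high NEG
fused = fuse_low_uncorrelated(flat, edge)
# Near the step, the edge image wins the activity comparison.
```

Because the Laplacian responds only where intensity changes, this rule transfers contrast edges into the fused subband while leaving smooth regions to the other source.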
The detailed description of the algorithm is shown in Algorithm 1.
Experiment and analysis
Comparison algorithms
Nine medical image fusion methods proposed in recent years are compared with the proposed method. These methods are based on SR or multiscale transforms and include LP–ASR [33], SR–NSCT [34], the parameter-adaptive pulse-coupled neural network (PA–PCNN) [13], PC–LLE [14], IOI–LLF [35], CNN–CP [15], CoF–MLE–NSST [36], PSO–NSST [37], and PCNN–NSST [38]. The LP–ASR method is based on Laplacian pyramid decomposition and adaptive SR, and its sparse coefficient fusion scheme reduces the noise of the high-frequency components. The SR–NSCT method incorporates NSCT into the SR fusion framework, with different fusion strategies for the low- and high-frequency coefficients. The PA–PCNN method first performs NSST decomposition on the source images and then applies a PA–PCNN model to high-frequency subband fusion. After NSCT decomposition, the PC–LLE method fuses the high-frequency subbands by a phase congruency rule. The information of interest–local Laplacian filter (IOI–LLF) method uses local LLF to decompose the source image into residual and base images and further decomposes the residual image based on IOI. The CNN–CP method uses a trained Siamese convolutional network to fuse the pixel activity information of the source images and generate a weight map. The first two methods are based on SR, whereas the remaining methods are based on multiscale decomposition.
Objective evaluation metrics
To evaluate the performance of the various methods, five widely recognized objective metrics, namely, entropy (EN) [39], structural similarity (\(Q_{e}\)) [40], mutual information (MI) [41], the gradient-based metric (\(Q^{AB/F}\)) [42], and visual information fidelity (VIF) [43], are used in the experiments. EN reflects the amount of information contained in the fused image; \(Q_{e}\) represents the degree of similarity between the fused and source images; MI measures the source information transferred to the fused image; \(Q^{AB/F}\) is a gradient-based quality metric that mainly measures the edge information of the fused image; VIF is the information ratio between the fused image and the source images and is used to evaluate human visual perception of the fused image.
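Of these metrics, EN is the simplest to state precisely: the Shannon entropy of the fused image's gray-level histogram. A minimal sketch, assuming 8-bit gray levels:

```python
import numpy as np

def entropy_metric(img, levels=256):
    """EN: Shannon entropy of the gray-level histogram, in bits.
    A higher EN suggests the fused image carries more information."""
    hist, _ = np.histogram(img, bins=levels, range=(0, levels))
    p = hist / hist.sum()
    p = p[p > 0]                      # 0*log(0) terms contribute nothing
    return float(-(p * np.log2(p)).sum())

# An image using all 256 gray levels equally often attains the
# maximum EN of log2(256) = 8 bits.
uniform = np.arange(256).repeat(4).reshape(32, 32)
assert np.isclose(entropy_metric(uniform), 8.0)
```

The other four metrics require both the fused and source images; their reference implementations are given in the cited papers [40,41,42,43].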
Experimental settings
In this experiment, 10 groups of CT–MRI images and 10 groups of MRI–SPECT images are used in the fusion performance tests. As shown in Fig. 3, the first row (a) shows two sets of CT–MRI images, and the second row (b) shows two sets of MRI–SPECT images. These images are from the Whole Brain Atlas provided by Harvard Medical School; each image has a resolution of 256 × 256. All experiments are programmed in MATLAB 2014a, and the simulation environment is an Intel(R) Core(TM) i7-8565U CPU @ 1.80 GHz with 8.00 GB RAM.
Experimental results
Figure 4 shows the results of two groups of CT–MRI images obtained by the different fusion methods. LP–ASR and SR–NSCT lose part of the energy of the CT images and reduce the contrast of the fused images; PC–LLE and CNN–CP lose part of the information in the MRI source images, whereas noise-like artifacts exist in the IOI–LLF fused images. The fused images of PA–PCNN, PCNN–NSST, and the proposed method have better contrast and edge detail information.
Figure 5 shows the results of two sets of MRI–SPECT images obtained by the different fusion methods. Among them, LP–ASR and SR–NSCT lose part of the MRI information, and local image distortion exists. The IOI–LLF, PC–LLE, and CNN–CP methods retain complete texture information, but some SPECT functional information is missing. The fusion effect of PA–PCNN, PCNN–NSST, and the proposed method is subjectively better.
To evaluate the performance of each fusion method objectively, Tables 1 and 2, respectively, show the average scores of the CT–MRI and MRI–SPECT fusion results. The higher the metric value, the better the fusion performance; the highest score is indicated in bold, and the lowest score is indicated by subscript. In addition, the performance of the proposed method is compared with that of several recent NSST methods. Among them, the CoF–MLE–NSST method uses a co-occurrence filter to measure the activity of the low-frequency coefficients, the PSO–NSST method uses particle swarm optimization to optimize the membership function of a fuzzy logic system for the low-frequency subband, and the PCNN–NSST method uses a PCNN to fuse the high-frequency subband. Compared with the other nine methods, the proposed method ranks first in \(Q_{e}\), MI, and \(Q^{AB/F}\) for both CT–MRI and MRI–SPECT images, indicating that it preserves most of the structure information in the source images and keeps their edges well. Because transform-domain methods are accompanied by a certain loss of information, the rankings in EN and VIF are not the highest, but they remain relatively high, indicating that the proposed method is robust. The proposed method is inferior to the PA–PCNN method in VIF, because the latter adopts a neuron model resembling human perception.
To compare the computational costs of the different fusion methods, the total time for the 10 groups of CT–MRI fusion images is first measured and then divided by 10 to obtain the average running time. The measurement was repeated 10 times, and the means and standard deviations are shown in Table 3. The proposed method is slower than the LP–ASR and PC–LLE methods but faster than the other methods. In particular, the performance of the proposed method is similar to that of PA–PCNN, but its computational efficiency is higher, because the iteration process of PCNN is time consuming. The IOI–LLF method has the lowest computational efficiency, because the IOI step takes too much time.
The proposed algorithm's performance was also evaluated by varying the parameters used in the method, such as the NSST decomposition level and the number of directions. The values are averaged over 20 pairs of multimodality medical images, and the outcomes are shown in Table 4. Table 4 shows that as the decomposition level and number of directions increase, the values of EN and MI also increase. The values of \(Q_{e}\), \(Q^{AB/F}\), and VIF are optimal when Level = 3. In general, as the level increases, the value of each metric increases slightly.
Conclusions
In this study, we propose a multimodal medical image fusion method based on NSST and mutual information correlation analysis. Building on NSST scale decomposition, the method uses CSR to enhance the high-frequency detail information and uses mutual information correlation to mine the detail information of the low-frequency subband. Different fusion strategies are then adopted for different areas of the low-frequency subband according to their correlation. To this end, two new activity level measures, based on the neighborhood energy gradient and the center pixel energy sum, are designed. Extensive experiments and comparisons with other advanced methods demonstrate the effectiveness of the proposed method. However, the method still has the following limitations. First, the threshold of the low-frequency correlation analysis influences the final fusion effect: if the threshold is set too small, the extraction of detail information is insufficient; if it is set too large, meaningless details in the MRI image are introduced into the fused image, causing artifacts. In this study, the mutual information of the whole source image is used as the threshold for the correlation analysis of the low-frequency subband; this strategy is not optimal. In addition, Table 3 shows that this method is not as fast as some fusion methods, because the local mutual information correlation is calculated with a sliding window, resulting in low computational efficiency. In the future, we will pursue a more effective threshold determination scheme that incorporates prior information from the source images.
References
Ganasala P, Kumar V (2016) Featuremotivated simplified adaptive PCNNbased medical image fusion algorithm in NSST domain. J Digit Imaging 29:73–85
Qi G, Wang J, Zhang Q, Zeng F, Zhu Z (2017) An integrated dictionary learning entropy-based medical image fusion framework. Future Internet 9(4):61
Petrovic V, Xydeas C (2004) Gradientbased multiresolution image fusion. IEEE Trans Image Process 13:228–237
Sundar K, Jahnavi M, Lakshmisaritha K (2017) Multisensor image fusion based on empirical wavelet transform. In: 2017 international conference on electrical, electronics, communication, computer, and optimization techniques (ICEECCOT). IEEE, pp 93–97
Liu Y, Liu S, Wang Z (2015) Multifocus image fusion with dense SIFT. Inf Fusion 23:139–155
Xia J, Lu Y, Tan L et al (2021) Intelligent fusion of infrared and visible image data based on convolutional sparse representation and improved pulse-coupled neural network. Comput Mater Continua 67(1):613–624
Yuan G, Ma S, Liu J et al (2021) Fusion of medical images based on salient features extraction by PSO optimized fuzzy logic in NSST domain. Biomed Signal Process Control 69(12):102852
Ouerghi H, Mourali O, Zagrouba E (2020) Multimodal image fusion based on weight local features and novel summodifiedLaplacian in nonsubsampled Shearlet transform domain. In: International symposium on visual computing
Shen J, Zhao Y, Yan S, Li X (2014) Exposure fusion using boosting Laplacian pyramid. IEEE Trans Cybern 44:1579–1590
Chen G et al (2019) Weighted sparse representation and gradient domain guided filter pyramid image fusion based on lowlightlevel dualchannel camera. IEEE Photon J 99:1
Li GX, Wang K (2007) Color image fusion algorithm using the contourlet transform. Acta Electron Sin 35:112
Miao QG, Cheng S, Xu PF et al (2011) A novel algorithm of image fusion using shearlets. Opt Commun 284(6):1540–1547
Yin M, Liu X, Liu Y et al (2018) Medical image fusion with parameteradaptive pulse coupled neural network in nonsubsampled shearlet transform domain. IEEE Trans Instrum Meas 68(1):49–64
Zhu Z, Zheng M, Qi G et al (2019) A phase congruency and local laplacian energy based multimodality medical image fusion method in NSCT domain. IEEE Access 7:20811–20824
Wang K, Zheng M, Wei H et al (2020) Multimodality medical image fusion using convolutional neural network and contrast pyramid. Sensors 20(8):2169
Shahdoosti HR, Ghassemian H (2016) Combining the spectral PCA and spatial PCA fusion methods by an optimal filter. Inf Fusion 27:150–160
Liu Y, Liu S, Wang Z (2015) A general framework for image fusion based on multiscale transform and sparse representation. Inf Fusion 24:147–164
Wang K, Qi G, Zhu Z, Chai Y (2017) A novel geometric dictionary construction approach for sparse representation based image fusion. Entropy 19:306
Garaigordobil A, Ansola R, Veguería E et al (2019) Overhang constraint for topology optimization of selfsupported compliant mechanisms considering additive manufacturing. Comput Aided Design 109:33–48
Subashini MM, Sahoo SK (2014) Pulse coupled neural networks and its applications. Expert Syst Appl 41(8):3965–3974
Wang M, Shang X (2020) An improved simplified PCNN model for salient region detection. Vis Comput 10–12:1–13
Easley G, Labate D, Lim WQ (2008) Sparse directional image representations using the discrete shearlet transform. Appl Comput Harmon Anal 25(1):25–46
Kim M, Han DK, Ko H (2016) Joint patch clusteringbased dictionary learning for multimodal image fusion. Inf Fusion 27:198–214
Liu H, Liu Y, Sun F (2015) Robust exemplar extraction using structured sparse coding. IEEE Trans Neural Netw 26:1816–1821
Yang J, Wright J, Huang TS, Ma Y (2010) Image superresolution via sparse representation. IEEE Trans Image Process 19(11):2861–2873
Dong W et al (2011) Image deblurring and superresolution by adaptive sparse domain selection and adaptive regularization. IEEE Trans Image Process 20(7):1838–1857
Yang B, Li S (2010) Multifocus image fusion and restoration with sparse representation. IEEE Trans Instrum Meas 59(4):884–892
Yin H, Li S, Fang L (2013) Simultaneous image fusion and superresolution using sparserepresentation. Inf Fusion 14:229–240
Wohlberg B (2015) Efficient algorithms for convolutional sparse representations. IEEE Trans Image Process 25(1):301–315
Liu Y, Chen X et al (2019) Medical image fusion via convolutional sparsity based morphological component analysis. IEEE Signal Process Lett 26(3):485–489
Aharon M, Elad M, Bruckstein A (2006) KSVD: an algorithm for designing overcomplete dictionaries for sparse representation. IEEE Trans Signal Process 54:4311–4322
Dong W et al (2013) Sparse representation based image interpolation with nonlocal autoregressive modeling. IEEE Trans Image Process 22(4):1382–1394
Wang Z, Cuia Z, Zhu Y (2020) Multimodal medical image fusion by Laplacian pyramid and adaptive sparse representation. Comput Biol Med 123:103823
Li Y, Sun Y, Huang X et al (2018) An image fusion method based on sparse representation and sum modifiedLaplacian in NSCT domain. Entropy 20(7):522
Jiao D, Li W, Xiao B (2017) Anatomicalfunctional image fusion by information of interest in local Laplacian filtering domain. IEEE Trans Image Process 12:1–1
Diwakar M, Singh P, Shankar A (2021) Multimodal medical image fusion framework using cooccurrence filter and local extrema in NSST domain. Biomed Signal Process Control 68(12):102788
Yuan GA et al (2021) Fusion of medical images based on salient features extraction by PSO optimized fuzzy logic in NSST domain. Biomedical Signal Process Control 69:102852
Wei T et al (2020) Multimodal medical image fusion algorithm in the era of big data. Neural Comput Appl 3:1–21
Cvejic N, Canagarajah C, Bull D (2006) Image fusion metric based on mutual information and Tsallis entropy. Electron Lett 42:626–627
Zhang XL, Li XF, Li J (2014) Validation and correlation analysis of metrics for evaluating performance of image fusion. Acta Autom Sin 40(2):306–315
Qu G, Zhang D, Yan P (2002) Information measure for performance of image fusion. Electron Lett 38:313–315
Petrović V (2007) Subjective tests for image fusion evaluation and objective metric validation. Inf Fusion 8:208–216
Bovik HA (2006) Image information and visual quality. IEEE Trans Image Process 15(2):430
Acknowledgements
This project is supported by the Provincial Natural Science Foundation of Hunan, China (Grant No. 2020JJ6021), the Research Foundation of Education Bureau of Hunan Province, China (Grant No. 21A0451, No. 19C0483), Construct Program of the Key Discipline in Hunan Province: Control Science and Engineering.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Guo, P., Xie, G., Li, R. et al. Multimodal medical image fusion with convolution sparse representation and mutual information correlation in NSST domain. Complex Intell. Syst. 9, 317–328 (2023). https://doi.org/10.1007/s40747-022-00792-9
Keywords
 Medical image fusion
 NSST
 Convolution sparse representation
 Mutual information correlation