Introduction

Image enhancement is one of the key challenges in the image processing field. Histogram equalization (HE) [1, 2] is a popular classical technique for image enhancement owing to its simplicity and effectiveness [3]. As such, several variants of HE have been proposed [4,5,6,7,8]. Global histogram equalization (GHE) utilizes a transfer function that transforms the histogram information of the whole image to the desired output image [5]. Despite the success of these global approaches for overall image enhancement, they still fail to conform to local brightness features. Therefore, local histogram equalization (LHE) was proposed to address this brightness problem [6]. Unfortunately, LHE suffers from high computational complexity due to its overlapping sliding-mask technique. Advances in technology have provided enough processing power to overcome this computational burden, yet LHE is still affected by amplified noise. Several approaches [9, 10] were proposed to overcome the problems associated with LHE by using non-overlapped or partially overlapped block-based HE. Nonetheless, all of these techniques produce an unwanted checkerboard effect on the enhanced image [11, 12].

Recently, many researchers have proposed image enhancement techniques [13,14,15,16,17]. Chitchian et al. [13] proposed a double density complex discrete wavelet transform (DD-CDWT) for medical image enhancement. The framework combined the double-density wavelet transform and the dual-tree complex wavelet transform for the denoising and enhancement process. Anantrasirichai et al. [14] developed a multi-scale enhancement framework using a dual-tree complex wavelet transform. The framework performed the image enhancement process using a smoothing operation that adopts a novel adaptive-weighted bilateral filter (AWBF). The AWBF framework consists of intensity adjustment, wavelet-based despeckling, and adaptive-weighted bilateral filtering, which employs adaptive weights based on local entropy. Liu et al. [15] proposed a medical image enhancement framework that utilized the collaborative shock filtering technique. The method denoised the image by a collaborative filter with a new similarity measure, followed by a gamma distribution. The denoised images were sharpened by a shock-type filter for edge and detail enhancement. For more research on image enhancement, readers can see [18,19,20,21,22,23]. Unfortunately, some of these existing frameworks are difficult to implement, are less robust, do not preserve micro-level structures, and suffer from high computational complexity. Additionally, most of the existing frameworks are monomodal and vendor-specific, which limits their usage across different machines. As a result, we propose a robust multi-vendor and multimodal framework that is computationally efficient and easy to implement, with high structural and edge preservation capability.

In this work, we propose a novel local transfer function based on a neighborhood similarity index (LTF-NSI) for medical image enhancement. The proposed algorithm utilizes the similarity index of the intensity distribution between adjoining pixels and the maximum grey level from the image histogram. The generated similarity index value determines the degree of similarity between the intensity distribution and the adjoining pixels, which in turn regulates the contrast and brightness of the image. To our knowledge, this study is among the first to propose a multi-vendor and multimodal framework to enhance medical images. Our objective is to improve the perception and interpretability of information in medical images to support effective clinical treatment and the monitoring of disease progression. Moreover, it provides a better input for medical algorithms such as automated segmentation frameworks. We anticipate that clinicians and computer vision programmers will benefit from the proposed LTF-NSI framework. The main contributions of the proposed LTF-NSI algorithm include (1) a novel local transfer function that integrates neighborhood information using the similarity index value, (2) a novel neighborhood similarity index, designed from the similarity relation matrix, that integrates the statistical information of the image, (3) an effective optimization of the algorithm parameters, and (4) significantly low computational complexity with real-time operation.

Background

The proposed method combines two techniques that differ in computation and implementation: the transfer function technique and the dragonfly optimization technique.

Transfer function technique

Digital images are made up of a matrix of pixels, each possessing at least three dimensions: two (or more) spatial coordinates and one intensity value. The transfer function technique alters the image pixel values for optimum display, i.e., it converts old pixel values to a new range. If the high and low pixel values that cover the range of useful information in an image are determined, then the transfer function can be used to control or manipulate the pixel values between those limits. Transfer functions are core techniques that change the brightness scale of an image so that information about important features is displayed clearly. Mathematically, the relationship between the old \((p)\) and new \((q)\) pixel values can be determined using the transfer function as

$$q=f\left(p\right),$$
(1)

where the function is embodied in the operator \(f\), which corresponds to the mathematical operation applied to the input values to produce the desired output values.

Many different transfer functions exist. The only constraint is that for each old pixel value, there must be only one new pixel value, i.e., the transfer function must be single valued. The choice of transfer function depends on the image type and on which information or features need to be emphasized or displayed. The most common types are the linear [24], logarithmic [25], gamma [26] and inverse linear (negative) transfer functions [27]. Transfer functions can help increase the contrast and, in turn, use more of the available range of brightness values. For a detailed study of transfer functions, readers can see [28,29,30,31].
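As an illustration of Eq. 1, the four common transfer functions named above can be sketched as lookup tables. This is a minimal sketch, not the implementation used in this work: an 8-bit grey scale (256 levels) is assumed, the parameter names (`gain`, `bias`, `g`) are illustrative, and plain Python lists stand in for an image array.

```python
import math

LEVELS = 256  # assumed number of grey levels (8-bit image)


def linear(p, gain=1.0, bias=0.0):
    """q = gain * p + bias, clipped to the valid grey-level range."""
    return min(LEVELS - 1, max(0, round(gain * p + bias)))


def logarithmic(p):
    """q = c * log(1 + p), with c chosen so that the top level maps to itself."""
    c = (LEVELS - 1) / math.log(LEVELS)
    return round(c * math.log(1 + p))


def gamma(p, g=0.5):
    """q = (L-1) * (p / (L-1)) ** g; g < 1 brightens, g > 1 darkens."""
    return round((LEVELS - 1) * (p / (LEVELS - 1)) ** g)


def negative(p):
    """Inverse linear (negative) transfer function: q = (L-1) - p."""
    return LEVELS - 1 - p


def apply_transfer(f, image):
    """Apply a single-valued transfer function f to every pixel via a lookup table."""
    lut = [f(p) for p in range(LEVELS)]
    return [[lut[p] for p in row] for row in image]


image = [[0, 64], [128, 255]]
print(apply_transfer(negative, image))
```

Because each function is evaluated once per grey level and stored in a table, the single-valued constraint is satisfied by construction.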

Dragonfly optimization technique

The dragonfly algorithm (DA) is a powerful optimization algorithm that mimics the swarming behavior of dragonflies. It is one of the more recently developed heuristic optimization algorithms, proposed by Mirjalili et al. [32]. Dragonflies swarm for two main reasons: hunting and migration, which give rise to static and dynamic swarms, respectively. During static swarming, small groups of dragonflies move over a small area to hunt other insects. This swarming behavior includes local movements and abrupt changes. In dynamic swarming, however, a large number of dragonflies form a single group and move in one direction over a long distance [33]. These swarming behaviors form the main inspiration of the DA. The static and dynamic swarming behaviors correspond, respectively, to the exploration and exploitation phases of a metaheuristic optimization algorithm. To guide artificial dragonflies along different paths, six weights are used, namely, the separation weight \((s)\), alignment weight \((a)\), cohesion weight \((c)\), enemy factor \((e)\), food factor \((f)\) and inertia weight \((\mu )\). For search space exploration, high alignment and low cohesion weights are used, while for search space exploitation, low alignment and high cohesion weights are used. In addition, the neighborhood radius is increased in proportion to the number of iterations to shift between exploration and exploitation. These swarming weights \((s, a, c, f, e, \text{and } \mu )\) are tuned adaptively during the optimization process to balance exploration and exploitation. The corresponding swarming motions are expressed mathematically in Eqs. 2–6.

The separation \((S)\) can be calculated according to Reynolds [34] as:

$${S}_{i}=-\sum_{j=1}^{N}\left(X-{X}_{j}\right),$$
(2)

where \(X\) represents the position of the current individual dragonfly, \({X}_{j}\) is the position of the \({j}^{\mathrm{th}}\) neighboring dragonfly, \(N\) is the number of neighboring individuals in the dragonfly swarm, and \({S}_{i}\) indicates the separation motion for the \({i}^{\mathrm{th}}\) individual dragonfly.

According to [32], the alignment \((A)\) is calculated as

$${A}_{i}=\frac{\sum_{j=1}^{N}{V}_{j}}{N},$$
(3)

where \({A}_{i}\) is the alignment motion for the \({i}^{\mathrm{th}}\) individual and \({V}_{j}\) is the velocity of the \({j}^{\mathrm{th}}\) neighboring dragonfly.

Cohesion \((C)\) is expressed as follows:

$${C}_{i}=\frac{\sum_{j=1}^{N}{X}_{j}}{N}-X.$$
(4)

where \({C}_{i}\) is the cohesion for the \({i}^{\mathrm{th}}\) individual, \(N\) is the neighborhood size, \({X}_{j}\) is the position of the \({j}^{\mathrm{th}}\) neighboring dragonfly, and \(X\) is the position of the current individual dragonfly.

Food attraction \((F)\) motion is expressed as follows:

$${F}_{i}={X}^{+}-X,$$
(5)

where \({F}_{i}\) is the food attraction for \({i}^{\mathrm{th}}\) dragonfly, \({X}^{+}\) is the position of the source of food, and \(X\) is the position of the current individual dragonfly. The food is the dragonfly that has the best objective function in this operation.

Distraction \((E)\) from predators is obtained as:

$${E}_{i}={X}^{-}-X,$$
(6)

where \({E}_{i}\) is the enemy’s distraction motion for the \({i}^{\mathrm{th}}\) individual, \({X}^{-}\) is the enemy’s position, and \(X\) is the position of the current individual dragonfly.
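The five swarming motions of Eqs. 2–6 can be sketched for a single dragonfly as follows. This is an illustrative sketch only: positions and velocities are plain Python lists, the function and argument names are hypothetical, and the sign of the enemy term follows Eq. 6 exactly as written above.

```python
def swarm_motions(x, neighbours_x, neighbours_v, food, enemy):
    """Return (S, A, C, F, E) for one dragonfly at position x.

    neighbours_x / neighbours_v hold the positions and velocities of the
    N neighbouring dragonflies; food and enemy are the positions X+ and X-.
    """
    n = len(neighbours_x)
    dims = range(len(x))
    # Eq. 2: separation, the negated sum of offsets to every neighbour.
    S = [-sum(x[d] - xj[d] for xj in neighbours_x) for d in dims]
    # Eq. 3: alignment, the mean neighbour velocity.
    A = [sum(vj[d] for vj in neighbours_v) / n for d in dims]
    # Eq. 4: cohesion, the offset to the neighbourhood centre of mass.
    C = [sum(xj[d] for xj in neighbours_x) / n - x[d] for d in dims]
    # Eq. 5: attraction towards the food source.
    F = [food[d] - x[d] for d in dims]
    # Eq. 6: enemy term, written as in the text.
    E = [enemy[d] - x[d] for d in dims]
    return S, A, C, F, E


S, A, C, F, E = swarm_motions(
    x=[0.0, 0.0],
    neighbours_x=[[1.0, 0.0], [0.0, 1.0]],
    neighbours_v=[[1.0, 1.0], [3.0, 1.0]],
    food=[2.0, 2.0],
    enemy=[-1.0, -1.0],
)
print(S, A, C, F, E)
```

In this small example the dragonfly at the origin is pushed away from nothing in particular (both neighbours lie on one side, so separation points towards them with a negated offset), aligned with the mean neighbour velocity, and pulled towards the neighbourhood centre.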

Data acquisition

A total of 400 medical images, consisting of 100 images each of X-ray, computed tomography (CT), optical coherence tomography angiography (OCTA), and fluorescein angiography (FA) with different diseases were used. This study was conducted in accordance with the Declaration of Helsinki and adhered to Good Clinical Practice guidelines. Approval for the protocol was obtained from the local ethics committee for each participating site. Written informed consent was obtained from all study patients.

Methods

The proposed LTF-NSI algorithm is depicted in Fig. 1. In practice, a local transformation function generates a new intensity value for each pixel in the input image, which is defined as

$$E\left(x,y\right)=T\left[I\left(x,y\right)\right],$$
(7)

where \(E\left(x,y\right)\) is the output image, \(T\) is the transformation function applied at pixel \(\left(x,y\right)\), and \(I\left(x, y\right)\) is the input image.

Fig. 1 Overview of the proposed LTF-NSI algorithm for medical image enhancement

The proposed LTF-NSI locally transforms each pixel in the input image by utilizing the similarity of the density distribution between the center pixel and its neighboring pixels. As such, we define the proposed LTF-NSI mathematically as:

$$E\left(x,y\right)=\left|\alpha \left(x, y\right)-l*\omega \right|+\vartheta {(x,y)}^{i}*I\left(x,y\right),$$
(8)

where \(\alpha \) is the enhancement function, \(\omega \) is the value of the most used gray level, and \(\vartheta \) is the similarity index. The enhancement function \(\alpha \) is obtained as

$$\alpha \left(x,y\right)=\frac{q*\beta }{\sigma \left(x, y\right)+j},$$
(9)

where \(\beta \) is the global average value of the image pixels, and \(\sigma \) is the local standard deviation. For an image of size \(U\times V\), we define \(\beta \) as

$$\beta =\frac{1}{U\times V}\sum_{x=0}^{U-1}\sum_{y=0}^{V-1}I(x, y)$$
(10)

and \(\sigma \) is defined as

$$\sigma \left(x, y\right)=\sqrt{\frac{1}{v\times v}\sum_{a=0}^{v-1}\sum_{b=0}^{v-1}{\left(I\left(x, y\right)-{L}_{avg}\right)}^{2}},$$
(11)

where \({L}_{avg}\) is the local average value of the pixels in a specific pixel block, and \(i,j, l,\) and \(q\) are parameters optimized through DA. To obtain the similarity measurement between two pixels \(i\) and \(j\), the distance between them is computed as \({d}_{ij}={d}_{i}-{d}_{j}\). The similarity value between the two pixels \(i\) and \(j\) is then obtained as

$${S}_{(i, j)}=\mathrm{exp}\left(\frac{{d}_{ij}}{N}\right),$$
(12)

where \({d}_{ij}\) is the distance between the two pixels \(i\) and \(j\) within the mask, and \(N\) is the normalization coefficient. To determine the similarity value of two pixels \(i\) and \(j\), a \(3\times 3\) mask is sufficient to provide adequate neighborhood information about the center pixel \(i\) or \(j\). As such, \(9\) pixel values are generated from each of the \(3\times 3\) masks of the \(i\) and \(j\) pixels. These \(9\) pixel values from each of \(i\) and \(j\) require a \(9\times 9\) matrix. Therefore, a \(9\times 9\) mask is used to construct the similarity relation matrix. The neighborhood similarity index \(\vartheta \) is obtained as

$$\vartheta (x, y)=\frac{1}{81}*\sum_{i=1}^{9}\sum_{j=1}^{9}{S}_{ij},$$
(13)

where \({S}_{ij}\) is the similarity value of the two pixels \(i\) and \(j\), ranging from 0 to 1.
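One possible reading of Eqs. 12 and 13 is sketched below for illustration. Several points are assumptions rather than statements of the method: \({d}_{i}\) is taken to be the intensity of pixel \(i\), the nine pixels come from the \(3\times 3\) neighborhood of the center pixel, the exponent is taken as \(-\left|{d}_{ij}\right|/N\) so that the similarity stays within the stated 0 to 1 range, and \(N=255\). Border pixels are simply left at 1.

```python
import math


def similarity_index(patch, norm=255.0):
    """Mean of the 9x9 similarity relation matrix of a 3x3 patch (Eq. 13)."""
    values = [v for row in patch for v in row]  # the 9 intensities
    total = 0.0
    for di in values:
        for dj in values:
            # Assumption: a negative exponent of |d_i - d_j| keeps S in (0, 1].
            total += math.exp(-abs(di - dj) / norm)
    return total / (len(values) ** 2)


def similarity_image(image, norm=255.0):
    """Similarity index for every interior pixel; borders are left at 1.0."""
    h, w = len(image), len(image[0])
    out = [[1.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            patch = [row[x - 1:x + 2] for row in image[y - 1:y + 2]]
            out[y][x] = similarity_index(patch, norm)
    return out


flat = [[80] * 3 for _ in range(3)]
spot = [[0, 0, 0], [0, 255, 0], [0, 0, 0]]
print(similarity_index(flat), similarity_index(spot))
```

Under these assumptions a homogeneous neighborhood yields an index of 1, and the index falls as the intensities within the neighborhood diverge.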

To optimize the LTF-NSI parameters (\(i,j, l,\) and \(q\)), we applied the dragonfly technique for the heuristic optimization process. For this purpose, two crucial phases of optimization (exploitation and exploration) are designed by modelling the social interaction of dragonflies in maneuvering, searching for food, and evading enemies when swarming statically or dynamically. The key features of these dragonflies include separation, alignment, cohesion, food, and enemy. The dragonflies separate to avoid colliding with neighboring members, alignment signifies velocity matching with neighbors, and cohesion defines the navigation of an individual dragonfly towards the center of its neighborhood. We remodel the dragonfly algorithm by incorporating two additional vectors, namely the step vector (\(\Delta \partial \)) and the position vector (\(\partial \)). As such, we redefine the step vector as:

$$\Delta {\partial }_{t+1}= \left(s{S}_{i}+a{A}_{i}+c{C}_{i}+f{F}_{i}+e{E}_{i}\right)+\mu \Delta {\partial }_{t},$$
(14)

where \(s, a, c, f,\) and \(e\) are the separation, alignment, cohesion, food, and enemy factors for the \(i\)th dragonfly at iteration \(t\), and \(\mu \) is the inertia weight. The position vector \({\partial }_{t+1}\) is then defined as

$$ {\partial }_{t+1}={\partial }_{t}+\Delta {\partial }_{t+1}.$$
(15)

To improve the stochastic behavior, dragonfly discovery, and randomness properties of the algorithm, we utilized the Lévy flight technique [35] to update the position of the dragonflies using the modified equation as

$$ {\partial }_{t+1}={\partial }_{t}+L{\acute{e}}vy*{\partial }_{t}.$$
(16)
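The updates in Eqs. 14–16 can be sketched for one dragonfly as follows. This is a hedged sketch: the Lévy step is generated with Mantegna's algorithm, and the exponent `beta = 1.5`, the `0.01` scale factor, and the `use_levy` switch are assumptions of this example rather than values stated in the text. All function names are illustrative.

```python
import math
import random


def levy_step(dim, beta=1.5, rng=random):
    """Heavy-tailed random step via Mantegna's algorithm (beta is assumed)."""
    num = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    den = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    sigma = (num / den) ** (1 / beta)
    out = []
    for _ in range(dim):
        r1 = rng.gauss(0.0, sigma)
        r2 = max(abs(rng.gauss(0.0, 1.0)), 1e-12)  # avoid division by zero
        out.append(0.01 * r1 / r2 ** (1 / beta))   # 0.01 scale is assumed
    return out


def da_update(pos, step, motions, weights, use_levy=False, rng=random):
    """Return (new_pos, new_step) for one dragonfly."""
    if use_levy:
        # Eq. 16: random walk scaled by the current position.
        lv = levy_step(len(pos), rng=rng)
        return [p + l * p for p, l in zip(pos, lv)], [0.0] * len(pos)
    S, A, C, F, E = motions
    s, a, c, f, e, mu = weights
    # Eq. 14: weighted sum of the five motions plus the inertia term.
    new_step = [s * S[d] + a * A[d] + c * C[d] + f * F[d] + e * E[d] + mu * step[d]
                for d in range(len(pos))]
    # Eq. 15: move by the new step.
    new_pos = [p + ds for p, ds in zip(pos, new_step)]
    return new_pos, new_step


motions = ([1.0, 1.0], [2.0, 1.0], [0.5, 0.5], [2.0, 2.0], [-1.0, -1.0])
weights = (0.1, 0.1, 0.7, 1.0, 1.0, 0.9)  # s, a, c, f, e, mu (illustrative)
print(da_update([1.0, 2.0], [0.1, 0.0], motions, weights))
```

Passing a seeded `random.Random` instance as `rng` makes the Lévy branch reproducible.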

The pseudocode of the modified DA used in this work is shown in Algorithm 1.

Algorithm 1 Pseudocode of the modified DA

A \(9\times 9\) mask is scanned across the input image, and the neighborhood similarity index is used to generate the similarity image (Fig. 2b) based on Eq. 13. Furthermore, we computed the standard deviation using Eq. 11 and obtained the most used gray level value in the image. These values were obtained before the search operation, since they do not change during the search. In addition, we set the number of agents and iterations of the dragonfly optimization algorithm. The result of this operation provides the values of \(i,j, l,\) and \(q\) for each search agent. We model a fitness operation as an objective evaluation criterion that determines the quality of the enhanced image during the operation as

$$\mathrm{Fitness}=\frac{\mathrm{log}\left(\mathrm{log}\left(\mathrm{Sobel}\left(I\right)\right)\right)\times {\mathrm{Edge}}_{\mathrm{no}}\left(I\right)\times H\left(I\right)}{M\times N},$$
(17)

where \(I\) is an image, \(\mathrm{Sobel}\) represents the Sobel value of the image, \(H\) is the entropy, \({\mathrm{Edge}}_{\mathrm{no}}\) is the number of edge pixels, and \(M\) and \(N\) are the image dimensions. After assessment by the evaluation function, the DA is updated as shown in Algorithm 1. This operation is repeated until the stopping criterion is met. The optimized values of \(i,j, l,\) and \(q\) are then inserted into the LTF-NSI model in Eq. 8 to obtain the enhanced image.
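The fitness criterion of Eq. 17 can be sketched as follows. The sketch makes several assumptions that the text does not fix: the Sobel value is taken as the sum of gradient magnitudes over interior pixels, an edge pixel is one whose magnitude exceeds a hypothetical threshold (`edge_threshold`), entropy is the Shannon entropy of the 8-bit histogram, and images whose Sobel sum is too small for the double logarithm to be positive are assigned a fitness of 0.

```python
import math


def sobel_magnitude(img):
    """Gradient magnitude at interior pixels using the 3x3 Sobel kernels."""
    h, w = len(img), len(img[0])
    mag = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y][x - 1] - img[y + 1][x - 1])
            gy = (img[y + 1][x - 1] + 2 * img[y + 1][x] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y - 1][x] - img[y - 1][x + 1])
            mag[y][x] = math.hypot(gx, gy)
    return mag


def entropy(img, levels=256):
    """Shannon entropy (bits) of the grey-level histogram."""
    hist = [0] * levels
    for row in img:
        for p in row:
            hist[p] += 1
    n = len(img) * len(img[0])
    return -sum((c / n) * math.log2(c / n) for c in hist if c)


def fitness(img, edge_threshold=100.0):
    """Eq. 17 under the assumptions stated in the lead-in."""
    m, n = len(img), len(img[0])
    mag = sobel_magnitude(img)
    sobel_sum = sum(map(sum, mag))
    edge_no = sum(v > edge_threshold for row in mag for v in row)
    if sobel_sum <= math.e:  # log(log(.)) would be undefined or negative
        return 0.0
    return math.log(math.log(sobel_sum)) * edge_no * entropy(img) / (m * n)


flat = [[100] * 5 for _ in range(5)]
edge = [[0, 0, 255, 255, 255] for _ in range(5)]
print(fitness(flat), fitness(edge))
```

An image with a sharp, well-populated edge scores higher than a flat one, which is the behavior the optimizer relies on when comparing candidate parameter sets.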

Fig. 2 Results of the proposed LTF-NSI based on the four image modalities. a Original images, b generated similarity image based on the neighborhood similarity index, and c enhanced images generated by the proposed algorithm
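To make the pipeline of Eqs. 8–11 concrete, the sketch below applies the transfer function once the similarity image and the four parameters are available. It is illustrative only and rests on assumptions: the local block of Eq. 11 is taken as a \(v\times v\) window centered on each pixel (truncated at the borders), the most used gray level \(\omega \) is taken as the histogram mode, the output is clipped to the 8-bit range, and \(j\) is assumed positive so that the denominator of Eq. 9 cannot vanish.

```python
import math
from collections import Counter


def local_std(img, y, x, v=3):
    """Standard deviation of the v x v block centred on (y, x) (Eq. 11)."""
    r = v // 2
    h, w = len(img), len(img[0])
    block = [img[b][a]
             for b in range(max(0, y - r), min(h, y + r + 1))
             for a in range(max(0, x - r), min(w, x + r + 1))]
    mean = sum(block) / len(block)  # L_avg of the block
    return math.sqrt(sum((p - mean) ** 2 for p in block) / len(block))


def ltf_nsi(img, theta, i, j, l, q):
    """Apply Eq. 8 given a similarity image theta and parameters i, j, l, q."""
    h, w = len(img), len(img[0])
    beta = sum(map(sum, img)) / (h * w)                          # Eq. 10
    omega = Counter(p for row in img for p in row).most_common(1)[0][0]
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            alpha = q * beta / (local_std(img, y, x) + j)        # Eq. 9
            e = abs(alpha - l * omega) + theta[y][x] ** i * img[y][x]  # Eq. 8
            out[y][x] = min(255.0, max(0.0, e))  # clipping is an assumption
    return out


img = [[100, 100], [100, 100]]
theta = [[1.0, 1.0], [1.0, 1.0]]
print(ltf_nsi(img, theta, i=1, j=1, l=0, q=1))
```

In an optimization loop, `ltf_nsi` would be evaluated once per search agent with that agent's candidate values of \(i,j, l,\) and \(q\), and the result scored by the fitness function.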

Result and analysis

Qualitative analysis

The results of the proposed LTF-NSI algorithm are presented in Fig. 2. These results show the enhancement effect of the proposed algorithm on low-contrast medical images. The proposed algorithm preserved the edges, details, and micro-level structures that are important in the analysis of medical images for clinical diagnosis. These results show that the proposed algorithm is effective in enhancing the images despite the difficulties posed by different image modalities and vendors.

The comparative results of the proposed algorithm with the state-of-the-art methods are shown in Fig. 3. Figure 3a is a low-contrast X-ray image, which is difficult for clinicians to utilize unless it is enhanced for proper visualization and data extraction. Figure 3b shows the CT image modality, which is crucial in brain imaging. For this image modality, micro-level structures and edges are difficult to visualize. Figure 3c shows the OCTA image, which is usually an inhomogeneous and low-quality image. It is often difficult to enhance such an image due to its high level of noise and low quality. Figure 3d shows the FA image, which usually has better quality than both X-ray and OCTA images but still suffers from low contrast. For adequate visualization of the enhancement effect of each algorithm, each of the results can be enlarged to observe the detail preservation, micro-level structure, structural edges, and overall image quality improvement. Visual observation through image enlargement is an important process, since clinicians enlarge images to observe features, structures, edges, etc. during medical diagnosis. Additionally, owing to the low visibility of the medical images, we also compare the proposed algorithm with the state-of-the-art low-light image enhancement frameworks proposed by Li et al. [16] and Ying et al. [17].

Fig. 3 Comparative results of the proposed algorithm with the state-of-the-art methods. a X-ray image, b CT image, c OCTA, and d FA image

Quantitative analysis

For effective quantitative analysis, we utilized various evaluation metrics to evaluate different aspects of the results of the proposed algorithm. Both subjective and objective based assessments were performed to ascertain the efficiency of the proposed algorithm.

Objective-based assessment

We used the following standard evaluation metrics for the performance evaluation: absolute mean brightness error (AMBE), measure of enhancement (EME), structure similarity (SSIM), and correlation (\(\rho \)). AMBE is defined as

$$\mathrm{AMBE}=\left|E\left(X\right)-E(Y)\right|,$$
(18)

where \(X\) is the input image, \(Y\) is the enhanced image and \(E\left(\bullet \right)\) is the mean value function.

EME is defined as

$${\mathrm{EME}}_{{k}_{1}{k}_{2}}\left(\varnothing \right)=\frac{1}{{k}_{1}{k}_{2}}\sum_{l=1}^{{k}_{1}}\sum_{k=1}^{{k}_{2}}20\,\mathrm{ln}\frac{{I}_{\mathrm{max};k,l}\left(\varnothing \right)}{{I}_{\mathrm{min};k,l}\left(\varnothing \right)+c},$$
(19)

where an image \(X(m,n)\) is divided into \({k}_{1}{k}_{2}\) regions, \((\varnothing )\) is an orthogonal transform, \({I}_{max;k,l}\) and \({I}_{min;k,l}\) are defined as the maximum and minimum intensity values and \(c\) is a constant.
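A spatial-domain sketch of Eq. 19 is given below. It assumes the orthogonal transform \(\varnothing \) is the identity, that the image divides into \({k}_{1}\times {k}_{2}\) equal blocks (any remainder rows or columns are ignored), and that an all-black block contributes nothing to the sum; the default value of `c` is an illustrative choice.

```python
import math


def eme(img, k1, k2, c=1e-4):
    """Measure of enhancement over k1 x k2 non-overlapping blocks (Eq. 19)."""
    h, w = len(img), len(img[0])
    bh, bw = h // k1, w // k2  # block height and width
    total = 0.0
    for l in range(k1):
        for k in range(k2):
            block = [p
                     for row in img[l * bh:(l + 1) * bh]
                     for p in row[k * bw:(k + 1) * bw]]
            hi, lo = max(block), min(block)
            if hi == 0:
                continue  # assumption: skip blocks where the log is undefined
            total += 20 * math.log(hi / (lo + c))
    return total / (k1 * k2)


img = [[50, 100, 50, 100] for _ in range(4)]
print(eme(img, 2, 2))
```

Each block here spans intensities 50 to 100, so every term is close to \(20\,\mathrm{ln}\,2\).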

SSIM is defined as

$$\mathrm{SSIM}\left(p,\ddot{p}\right)=\frac{(2{\mu }_{p}{\mu }_{\ddot{p}}+{a}_{1})(2{\sigma }_{p\ddot{p}}+{a}_{2})}{({\mu }_{p}^{2}+{\mu }_{\ddot{p}}^{2}+{a}_{1})({\sigma }_{p}^{2}+{\sigma }_{\ddot{p}}^{2}+{a}_{2})},$$
(20)

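where \({\mu }_{p}\) and \({\mu }_{\ddot{p}}\) are the mean intensities, \({\sigma }_{p}^{2}\) and \({\sigma }_{\ddot{p}}^{2}\) are the variances, and \({\sigma }_{p\ddot{p}}\) is the covariance of the two images; \({a}_{1}\) and \({a}_{2}\) are constants, \({a}_{1}=({k}_{1}L)\), \({a}_{2}=({k}_{2}L)\), where \({k}_{1}\) and \({k}_{2}\) are constants set to \(0.01\) and \(0.03\), and \(L\) is the dynamic range of the pixel values.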
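Equation 20 can be sketched as a single-window (global) computation, as below. Two caveats apply: practical SSIM implementations usually evaluate the expression over local windows and average the result, and many of them square the stabilizing constants; this sketch computes one value over the whole image and keeps the constants as written in the text.

```python
def ssim_global(p, q, k1=0.01, k2=0.03, L=255):
    """Single-window SSIM between two equally sized grey-level images (Eq. 20)."""
    a = [v for row in p for v in row]
    b = [v for row in q for v in row]
    n = len(a)
    mu_a, mu_b = sum(a) / n, sum(b) / n
    var_a = sum((v - mu_a) ** 2 for v in a) / n
    var_b = sum((v - mu_b) ** 2 for v in b) / n
    cov = sum((x - mu_a) * (y - mu_b) for x, y in zip(a, b)) / n
    # Constants as given in the text; many implementations use (k * L) ** 2.
    a1, a2 = k1 * L, k2 * L
    num = (2 * mu_a * mu_b + a1) * (2 * cov + a2)
    den = (mu_a ** 2 + mu_b ** 2 + a1) * (var_a + var_b + a2)
    return num / den


img = [[10, 50], [120, 200]]
inv = [[255 - v for v in row] for row in img]
print(ssim_global(img, img), ssim_global(img, inv))
```

Identical images give a value of 1, while the negative of an image, whose covariance with the original is negative, scores well below it.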

\(\rho \) is defined as

$$\rho =\frac{\Pi (p-{\mu }_{p},\ddot{p}-{\mu }_{\ddot{p}})}{\sqrt{\Pi (p-{\mu }_{p}, p-{\mu }_{p})\bullet \Pi (\ddot{p}-{\mu }_{\ddot{p}},\ddot{p}-{\mu }_{\ddot{p}})}},$$
(21)

where \(p\) is the original image and \(\ddot{p}\) is the enhanced image, and \(\Pi \) is defined as

$$\Pi \left({p}_{1},{p}_{2}\right)=\sum_{(i,j)\in \mathrm{ROI}}{p}_{1}\left(i,j\right)\bullet {p}_{2}\left(i,j\right),$$
(22)

where \(i\) and \(j\) are the pixel coordinates within the region of interest (ROI).
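Equations 21 and 22 can be sketched as follows, assuming the region of interest is the whole image. Returning 0 for a constant image, where the denominator vanishes, is a convention of this example only.

```python
import math


def inner(p1, p2):
    """Eq. 22: sum of pixel-wise products over the ROI (here, the whole image)."""
    return sum(a * b for r1, r2 in zip(p1, p2) for a, b in zip(r1, r2))


def correlation(p, q):
    """Eq. 21: correlation between the original image p and the enhanced image q."""
    n = len(p) * len(p[0])
    mu_p = sum(map(sum, p)) / n
    mu_q = sum(map(sum, q)) / n
    pc = [[v - mu_p for v in row] for row in p]  # mean-centred original
    qc = [[v - mu_q for v in row] for row in q]  # mean-centred enhanced
    den = math.sqrt(inner(pc, pc) * inner(qc, qc))
    if den == 0:
        return 0.0  # assumption: constant images carry no correlation
    return inner(pc, qc) / den


img = [[1, 2], [3, 4]]
print(correlation(img, img), correlation(img, [[4, 3], [2, 1]]))
```

Any increasing linear rescaling of the image leaves the correlation at 1, so the metric responds to structural changes rather than to brightness or contrast shifts alone.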

In addition, two standard error metrics, namely the mean square error (MSE) and the peak signal to noise ratio (PSNR), were used to evaluate the performance of the proposed algorithm. Tables 1 and 2 present the comparative results of the proposed algorithm with the state-of-the-art methods, with the best results shown in bold. Moreover, we also present the average time required by each of the algorithms in Tables 1 and 2.
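For reference, AMBE (Eq. 18) and the two error metrics can be sketched together. The standard definitions of MSE and PSNR are used, which the text does not spell out; the peak value of 255 assumes 8-bit images.

```python
import math


def _flatten(img):
    return [v for row in img for v in row]


def ambe(x, y):
    """Eq. 18: absolute difference between the mean intensities of x and y."""
    a, b = _flatten(x), _flatten(y)
    return abs(sum(a) / len(a) - sum(b) / len(b))


def mse(x, y):
    """Mean of the squared pixel-wise differences."""
    a, b = _flatten(x), _flatten(y)
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)


def psnr(x, y, peak=255):
    """Peak signal to noise ratio in dB; infinite for identical images."""
    m = mse(x, y)
    if m == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / m)


x = [[0, 0], [0, 0]]
y = [[10, 10], [10, 10]]
print(ambe(x, y), mse(x, y), psnr(x, y))
```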

Subjective-based assessment

We further evaluated the efficacy and potential usability of the proposed algorithm by conducting a mean opinion score (MOS) analysis [36] to compare it with the state-of-the-art methods. For this purpose, three experts were asked to rate the enhanced images of the eleven different enhancement methods on three metrics as follows:

(a) Clarity: ability to discriminate structures and details (1 = most clear, 6 = least clear).

(b) Accuracy: correctness with which the enhanced image shows structures and details at an indiscernible level (1 = most accurate, 6 = least accurate).

(c) Usage: inclination to use the algorithm in clinical routine (1 = most preferred, 6 = least preferred).

To perform the MOS analysis effectively, each expert had to satisfy the following criteria: (a) be a clinician, (b) have at least 7 years of clinical experience, (c) be free from any eye defect, and (d) be an independent expert. Each expert was shown, in an irregular order and without prior knowledge, the enhanced images generated by each of the eleven methods, and was then asked to rate them using the MOS metrics in turn. Furthermore, the experts were not allowed to know or see in advance the results of any model or method, as this would bias the evaluation process. In addition, each expert viewed and graded each image independently without any outside interference. In Figs. 5 and 6, we show the results of the three independent experts using the three qualitative metrics for the four medical image modalities. In addition, we investigated the potential variation in preferences between the independent experts for any image by computing the inter-observer agreement using Cohen’s Kappa score [37], as reported in Table 3.
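The inter-observer agreement can be illustrated with an unweighted Cohen's kappa for one pair of graders. This sketch assumes the three experts are compared pairwise; the rank lists below are made-up example values, not study data.

```python
from collections import Counter


def cohen_kappa(r1, r2):
    """Unweighted Cohen's kappa for two raters scoring the same items."""
    n = len(r1)
    # Observed agreement: fraction of items given the same rank by both raters.
    po = sum(a == b for a, b in zip(r1, r2)) / n
    # Chance agreement from each rater's marginal rank frequencies.
    c1, c2 = Counter(r1), Counter(r2)
    pe = sum(c1[k] * c2[k] for k in c1) / n ** 2
    if pe == 1:
        return 1.0  # both raters used a single identical rank throughout
    return (po - pe) / (1 - pe)


grader_a = [1, 1, 2, 2]  # hypothetical ranks
grader_b = [1, 1, 2, 1]
print(cohen_kappa(grader_a, grader_b))
```

In this toy example the graders agree on three of four items, half of which would be expected by chance, giving a kappa of 0.5.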

Table 1 Mean of the measure of enhancement (EME), peak signal to noise ratio (PSNR), structural similarity ratio (SSIM), correlation (\(\rho )\), and mean squared errors (MSE) for 100 images each for X-ray, CT, OCTA, and FA image modalities from patients with different retinal diseases
Table 2 Mean of the measure of enhancement (EME), peak signal to noise ratio (PSNR), structural similarity ratio (SSIM), correlation (\(\rho )\), and mean squared errors (MSE) for 100 images each for X-ray, CT, OCTA, and FA image modalities from patients with different retinal diseases
Table 3 Inter-observer agreement as measured by Cohen’s κ Score among the three expert graders for each of the qualitative metrics for the 100 images each for X-ray, CT, OCTA, and FA image modalities

Based on Fig. 3, we observed that the three state-of-the-art medical image enhancement methods perform the smoothing operation adequately on the images. The methods of Ying et al. [17] and Li et al. [16] also perform adequately on all the medical image modalities. For the X-ray image modality, the method of Anantrasirichai et al. [14] performs better than those of Liu et al. [15], Chitchian et al. [13], and Li et al. [16], while Ying et al. [17] performed the best among the state-of-the-art methods. The results of Ying’s and Anantrasirichai’s methods show noticeable improvement in image quality and preserve the image features to some extent, as compared with Liu’s, Chitchian’s, and Li’s methods. As can be observed, the proposed method produces the best results. For the CT image modality, Anantrasirichai’s method performed better than all the other state-of-the-art methods. Liu’s method produced an over-smoothed image, which does not preserve the needed features adequately. Chitchian’s method produced a blurry image in which the needed features and most of the micro-level structures were destroyed. Ying’s method performed fairly, while Li’s method produced an over-enhanced image. Again, the proposed method performed better than all the state-of-the-art methods. For the OCTA image modality, Anantrasirichai’s method produces better results than the four other methods. Liu’s method performed better than Chitchian’s method, which in turn performed better than Ying’s and Li’s methods. Li’s method produces better results than Ying’s method. The proposed method also performed better than all the state-of-the-art methods in this image modality. For the FA image modality, Ying’s method performed better than Anantrasirichai’s, Chitchian’s, Liu’s, and Li’s methods. Ying’s method preserved the features and structures of the image to a high extent. Anantrasirichai’s method also preserved the features and structures but failed to enhance the image adequately. Chitchian’s and Liu’s methods produced fairly enhanced images with features and structures well preserved. It is evident that the results of the proposed method are better than those of the state-of-the-art methods in the FA image modality.

In Fig. 4, we compared the proposed LTF-NSI with recently published works, namely MedGA [38], DPSF [39], and Zohair et al. [40], and with two well-established conventional methods, HE [2] and LHE [6]. We observed that for the X-ray and CT imaging modalities, the HE technique performed better than MedGA [38], DPSF [39], Zohair et al. [40], and LHE [6]. For the OCTA imaging modality, DPSF [39] performed best, and for the FA imaging modality, MedGA [38] outperformed all the other techniques (DPSF, Zohair, HE, and LHE). Even though HE outperformed the other techniques on the X-ray and CT imaging modalities, we observed artifacts in the X-ray results of both HE and DPSF. Furthermore, in the results of the HE technique, some features of the CT imaging modality were enhanced while others were not. DPSF, MedGA, and Zohair’s method adequately enhanced all the needed features, but not sufficiently to outperform the HE technique on the CT imaging modality. The OCTA imaging modality poses difficulty for these enhancement techniques due to the structures and textural arrangement of the tissues. OCTA is sensitive to transformation, which often results in over-enhancement, as can be seen in the results of the HE technique. In addition, only the MedGA and DPSF techniques were able to enhance FA images efficiently and accurately. The LHE technique performed the worst in all of the imaging modalities as compared with MedGA [38], DPSF [39], Zohair et al. [40], and HE [2]. The proposed algorithm performed better than all of these techniques (MedGA [38], DPSF [39], Zohair et al. [40], HE [2], and LHE [6]).

Fig. 4 Comparative results of the proposed algorithm with the current state-of-the-art and well-established conventional methods. a X-ray image, b CT image, c OCTA, and d FA image

In general, we compared the proposed LTF-NSI with ten image enhancement techniques, consisting of eight state-of-the-art techniques (Chitchian et al. [13], Anantrasirichai et al. [14], Liu et al. [15], Li et al. [16], Ying et al. [17], MedGA [38], DPSF [39], and Zohair et al. [40]) and two well-established conventional techniques (HE [2] and LHE [6]). Among these techniques, HE outperformed the others on X-ray, followed by DPSF, Zohair, and MedGA, and then the remaining techniques. HE also performed the best on the CT imaging modality, followed by MedGA, Zohair, DPSF, and the other techniques. On the OCTA imaging modality, DPSF performed better than the other state-of-the-art techniques by retaining the image features and structures effectively, MedGA performed better than both HE and Zohair, and Zohair performed better than the remaining techniques. MedGA and DPSF outperformed all other techniques on the FA imaging modality. Both the quantitative and qualitative assessments show that the proposed LTF-NSI performed better than all ten techniques.

Based on the experimental results, the images generated by each of the medical imaging devices vary in quality, contrast, noise, etc., and as a result, each of these devices poses different challenges for the enhancement algorithms. These challenges determine the algorithm performance in terms of efficiency and computational cost. The proposed LTF-NSI algorithm produces the best results in terms of enhancement, textural detail, boundary preservation, disease visibility, etc., without artifacts or unwanted dots on the image. The features, edges, and structures produced by the proposed method were distinctly visible with a high degree of accuracy in all four medical imaging modalities. In addition, the disease boundaries, which are an important indicator of volume, are preserved and clearly visible without blurring.

Moreover, we observed that the state-of-the-art and conventional methods do not adequately preserve the features, edges, details, and micro-level structures of the image. These properties play an important role in disease detection and quantification but were lost in the state-of-the-art and conventional methods. Even though an improvement in image contrast is observed with the state-of-the-art methods, it comes at the cost of blurry edges, faded boundaries, and the loss of features and useful details. In contrast, the results of the proposed method are better than those of all eight state-of-the-art and two conventional methods, especially in terms of clear edges, visible features, sharpness, distinction of diseases, and detail visibility in all four medical image modalities.

The quantitative results of the eight state-of-the-art methods, the two conventional methods, and the proposed LTF-NSI are presented in Tables 1 and 2. Here, the presented values for the EME, PSNR, SSIM, ρ, and MSE are averages over the 100 images each for the CT, X-ray, FA, and OCTA image modalities. For each of the evaluation metrics, the proposed method obtained the best results, showing again its efficiency over the state-of-the-art and conventional methods in terms of quantitative evaluation in all four medical image modalities. Additionally, we performed a paired t test for the proposed method with a statistical significance level of p < 0.05. The results of the pairwise comparisons of the EME, PSNR, SSIM, \(\rho \), and MSE show statistical significance (p < 0.0003). The proposed LTF-NSI, the eight state-of-the-art methods, and the two conventional methods were implemented in MATLAB R2013a and tested on a PC with an Intel Core i5-4200U CPU at 1.60 GHz with 8 GB of RAM. The average times to process the 100 images each for the CT, X-ray, FA, and OCTA image modalities are also presented in Tables 1 and 2. The results show that the proposed framework has a lower computational cost than the state-of-the-art and conventional methods in all four medical image modalities, with Ying et al. [17] and DPSF [39] roughly comparable to the proposed algorithm in terms of speed.

As shown in Figs. 5 and 6, the results obtained by the three expert graders show that the enhanced images generated by the proposed method are highly ranked on all three evaluation metrics (clarity, accuracy, and clinical preference). According to the expert graders, the proposed method obtained: (1) for the X-ray image modality, no less than 98% for clarity, 99% for accuracy, and 98% for clinical preference; (2) for the CT image modality, no less than 100% for clarity, 100% for accuracy, and 100% for clinical preference; (3) for the OCTA image modality, no less than 100% for clarity, 100% for accuracy, and 100% for clinical preference; and (4) for the FFA image modality, no less than 100% for clarity. For the X-ray image modality, for example, the features, diseases, and micro-level structures were clearest in 98% of the images, and no more than 2% were very clear and distinct. This also holds true for accuracy and clinical preference. Compared with the state-of-the-art and conventional methods, the proposed algorithm obtained the highest ranking from the three expert graders on all three metrics and all four medical image modalities. Moreover, the inter-observer agreement shows nearly perfect agreement between all the expert graders on all three evaluation metrics, as shown in Table 3. A κ score of 0.61 to 0.80 signifies substantial agreement, 0.81 to 0.99 indicates nearly perfect agreement, and 1 represents perfect agreement [37]; the high κ scores for comparisons among all the expert graders therefore indicate small variability among them.
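As an illustration of how a κ score between two graders can be obtained and mapped to the agreement bands quoted above, the following minimal Python sketch computes Cohen's κ for two lists of rank labels. The choice of Cohen's (pairwise, unweighted) κ is an assumption, since the exact κ variant is not restated here, and the function names and example ratings are hypothetical.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two graders who rated the same images."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two non-empty rating lists of equal length")
    n = len(ratings_a)
    # Observed agreement: fraction of images given the same label by both graders.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of the graders' marginal label frequencies.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both graders used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


def agreement_level(kappa):
    """Map kappa to the bands quoted in the text (0.61-0.80, 0.81-0.99, 1)."""
    if kappa >= 1.0:
        return "perfect"
    if kappa >= 0.81:
        return "nearly perfect"
    if kappa >= 0.61:
        return "substantial"
    return "below substantial"


if __name__ == "__main__":
    # Hypothetical rank labels (1 = highest rank) from two graders for ten images.
    grader_1 = [1, 1, 1, 2, 1, 1, 2, 1, 1, 1]
    grader_2 = [1, 1, 1, 2, 1, 1, 1, 1, 1, 1]
    kappa = cohen_kappa(grader_1, grader_2)
    print(f"kappa = {kappa:.3f} ({agreement_level(kappa)} agreement)")
```

With three graders, this pairwise score would be computed for each pair of graders, which matches the notion of comparisons among all the expert graders.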

Fig. 5

Average qualitative results for perceived clarity, accuracy, and preference (clinical usage) for 100 images each for X-ray, CT, OCTA, and FFA image modalities ranked by three observers. The values indicate the percentage of images in each rank position (1 = highest rank, 6 = lowest rank) using Ying, Li, Chitchian, AWBF, Liu, and the proposed LTF-NSI

Fig. 6

Average qualitative results for perceived clarity, accuracy, and preference (clinical usage) for 100 images each for X-ray, CT, OCTA, and FFA image modalities ranked by three observers. The values indicate the percentage of images in each rank position (1 = highest rank, 6 = lowest rank) using MedGA, DPSF, Zohair, HE, LHE, and the proposed LTF-NSI

Conclusion

In this work, we developed a robust and fast multimodal and multi-vendor enhancement algorithm for medical images based on a novel local transfer function that uses a novel neighborhood similarity index. The proposed algorithm uses an optimization algorithm for optimal parameter selection and can be implemented in real time for clinical usage. Based on the experimental results, the proposed algorithm is more effective and efficient than the state-of-the-art and conventional methods. In Fig. 7, we present an ROI image to show the enhancement effect of each technique. Because of the size of the images, it is difficult for an observer to evaluate them properly unless they are enlarged. Without enlargement, the DPSF and HE techniques appear to give adequate enhancement, followed by the proposed method. However, both DPSF and HE suffer from excessive enhancement and unwanted artifacts, which are not found in the result of the proposed method. In addition, upon enlargement (ROI, red rectangular frame), we observed that in both the DPSF and HE results the required micro and macro structures were lost and the necessary disease regions (black) were drastically reduced, which is caused by over-enhancement (the regions turn white). Enlargement of medical images is a crucial step for clinicians in observing micro/macro changes in the structure of any tissue, bone, or body part. Furthermore, the right-hand side structures in the ROI (Fig. 7) were lost completely (over-enhanced) by both the DPSF and HE techniques but were adequately preserved by the proposed method. These are challenges currently faced by medical image enhancement algorithms, and they were overcome by the proposed algorithm.
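The ROI observation above can also be checked numerically. The following minimal Python sketch crops a rectangular ROI and reports the fractions of dark and near-white pixels, so that a drop in the dark fraction together with a rise in the bright fraction after enhancement would point to over-enhancement. It was not part of the evaluation in this work; the function names and both thresholds (50 and 205 on an 8-bit scale) are hypothetical choices for illustration.

```python
def crop_roi(image, top, left, height, width):
    """Return the rectangular region of a grayscale image given as a list of rows."""
    return [row[left:left + width] for row in image[top:top + height]]


def intensity_fractions(roi, dark_max=50, bright_min=205):
    """Return (dark fraction, bright fraction) of the pixels in the ROI.

    dark_max and bright_min are illustrative thresholds, not values from the paper.
    """
    pixels = [p for row in roi for p in row]
    if not pixels:
        raise ValueError("ROI is empty")
    dark = sum(p <= dark_max for p in pixels) / len(pixels)
    bright = sum(p >= bright_min for p in pixels) / len(pixels)
    return dark, bright


if __name__ == "__main__":
    # Hypothetical 4x4 image; the ROI is its top two rows.
    image = [
        [0, 0, 255, 255],
        [0, 10, 250, 255],
        [100, 100, 100, 100],
        [100, 100, 100, 100],
    ]
    dark, bright = intensity_fractions(crop_roi(image, 0, 0, 2, 4))
    print(f"dark fraction = {dark:.2f}, bright fraction = {bright:.2f}")
```

Comparing these two fractions for the same ROI before and after each enhancement method gives a simple numerical counterpart to the visual inspection.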

Fig. 7

Comparative results of the proposed algorithm with the current state-of-the-art and well-established conventional methods based on X-ray images

We believe that the proposed algorithm is useful for medical feature enhancement [41], for providing better input to other automated image processing techniques [42, 43], and for disease monitoring [44]. In addition, we anticipate that the proposed LTF-NSI will be a powerful tool for clinical diagnosis and disease monitoring. The shortcoming of the proposed algorithm is its computational complexity, which requires further optimization for efficient clinical usage.