1 Introduction

Image segmentation is a critical step that separates a digital image into homogeneous parts or classes with comparable features such as color, texture, and intensity [1]. Its purpose is to give a simpler, more meaningful representation of an image that can be studied more easily. Segmentation assigns a label to each pixel so that pixels with the same label share the same characteristics, and it can locate objects and boundaries in the image, such as curves or lines. The output of image segmentation is a set of segments that together cover the entire image. In the last few years, there has been growing interest in image segmentation for problems in the medical field such as tumor detection and segmentation [2], cell segmentation in histopathology [3], blood vessel segmentation [4], and more. Color image segmentation is one kind of image segmentation: it splits a color image into connected, homogeneous regions based on color, texture, or their combination [5]. Many segmentation techniques are available in the literature, including thresholding-based, artificial neural network-based, edge-based, region-based, watershed-based, swarm intelligence-based, clustering-based, and Partial Differential Equation (PDE)-based methods. Figure 1 shows techniques for image segmentation.

Fig. 1 Methods for Image Segmentation

1.1 Thresholding Based Method

The thresholding method is a popular image segmentation technique that splits the pixels in an image according to their intensity level [6, 7]. Thresholding methods can be classified into three categories: bi-level, multi-level, and local thresholding. For a gray-scale image, bi-level thresholding separates the image into two classes, i.e. objects and background. Multi-level thresholding, on the other hand, divides complex images into more than two homogeneous classes/regions. The main goal of bi-level and multi-level thresholding is to determine the optimal threshold values. The major drawback of thresholding-based techniques is that they become computationally expensive as the number of threshold values increases. In local thresholding, each region of the image is assigned a different threshold value, and the threshold values are heavily influenced by each pixel’s surrounding information. However, local thresholding cannot obtain accurate results for some image segmentation problems [8]. A new method was proposed in [9] that uses the Black Widow Spider Optimization Algorithm to determine the optimal thresholds in the Otsu method for image segmentation. The experimental results indicate that the proposed algorithm finds better thresholds than the optimization algorithms it was compared with. The method shows promising performance in terms of the PSNR and SSIM indexes, particularly in thresholding medical images such as brain tumors and optic disc detection in retinal images. In [10], the authors employed the Otsu multi-thresholding technique with the Sine Cosine Algorithm (SCA) to detect lung cancer in CT images. The results were evaluated using metrics such as PSNR, SSIM, and computation time, and the evaluation showed that the proposed approach outperformed the methods it was compared with.
In [11], an algorithm called HVSFLA is proposed for multi-threshold image segmentation; it is an ensemble multi-strategy-driven shuffled frog leaping algorithm with horizontal and vertical crossover search. The horizontal crossover enables information exchange and exploration among different frogs, while the vertical crossover encourages stagnant frogs to search actively. The algorithm is compared with state-of-the-art methods on benchmark functions and the BSDS500 dataset, where it is reported to perform better. Applied to breast invasive ductal carcinoma cases, HVSFLA outperforms its competitors, making it a promising option for medical image segmentation.
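The bi-level case described in this subsection can be illustrated with a minimal sketch of Otsu's criterion, which picks the threshold that maximizes the between-class variance of the two classes. The function name `otsu_threshold`, the flat list of integer gray levels as input, and the convention that pixels with intensity at or below the threshold form the first class are assumptions made for illustration; they are not taken from the cited works.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes Otsu's between-class variance.

    Pixels with intensity <= t form class 0, the rest form class 1.
    `pixels` is a flat iterable of integers in [0, levels).
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_score = 0, -1.0
    w0 = 0      # number of pixels in class 0
    sum0 = 0    # sum of intensities in class 0
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, the split is not valid
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Between-class variance up to the constant factor 1 / total**2,
        # which does not change the position of the maximum.
        score = w0 * w1 * (mu0 - mu1) ** 2
        if score > best_score:
            best_score, best_t = score, t
    return best_t
```

With a single threshold the exhaustive loop is cheap. With k thresholds the same exhaustive search has to examine on the order of `levels**k` combinations, which is the cost that motivates replacing it with an optimization algorithm.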

1.2 Artificial Neural Network-Based method

The Artificial Neural Network (ANN)-based segmentation method uses a group of neurons to learn the regions of interest in a given image and recognize foreground objects [12]. Using neural networks for image segmentation involves three stages: feature extraction, ANN training on the feature sets, and testing. The major drawbacks of neural network-based segmentation are that it is computationally expensive and depends on training data. The authors in [13] develop an approach for the detection and segmentation of kidney diseases using clustering and classification techniques: they employ an artificial neural network for stone detection and a multi-kernel k-means clustering algorithm for segmentation. The method consists of preprocessing, feature extraction, classification, and segmentation modules that involve noise removal, GLCM feature extraction, neural network classification, and separate segmentation of stones and tumours. The evaluation showed that the proposed approach outperformed the methods it was compared with. In [14], the authors introduce Enhanced Convolutional Neural Networks (ECNN) with loss function optimization using the BAT algorithm for MRI image segmentation. The method utilizes small kernels in a deep architecture to mitigate overfitting. Pre-processing involves skull stripping and image enhancement algorithms. The results indicate improved performance compared to existing methods.

1.3 Edge-based method

The edge-based method converts a given image to an edge image by using the changes in gray level across the image, and edge-based methods are considered robust segmentation methods [5]. Edges mark discontinuities in the image and the points where a region ends. Objects consist of multiple parts with different color levels, and the detected boundary defines the edges of the expected object. As a result, the object can be segmented from the image by detecting its edges. Many operators are used to find the edges of objects, such as the Sobel operator, the Prewitt operator, the Wallis operator, the Laplacian, the Kirsch operator, and the Canny operator. Edge detection has been applied in many image processing applications, such as the detection of blood vessels [15] and the detection of brain tumors [16]. The authors in [17] enhance the active contour (AC) model by optimizing the initial contour with a genetic algorithm (GA), improving the performance in detecting skin lesion boundaries. The optimized initial active contour is evaluated against graph-cut and k-means methods using diverse evaluation metrics in the segmentation of dermoscopic images.
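As an illustration of how an edge operator turns gray-level changes into an edge image, the following sketch applies the Sobel operator to a gray-scale image stored as a list of rows. The function name `sobel_edges`, the fixed magnitude threshold, and the choice to leave border pixels unmarked are assumptions for this example only.

```python
import math

# Standard 3x3 Sobel kernels for horizontal and vertical intensity changes.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_edges(img, thresh):
    """Return a binary edge map: 1 where the Sobel gradient magnitude >= thresh.

    `img` is a list of rows of gray levels. Border pixels are left at 0
    because the 3x3 window does not fit there.
    """
    h, w = len(img), len(img[0])
    edges = [[0] * w for _ in range(h)]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            gx = gy = 0
            for dr in range(3):
                for dc in range(3):
                    value = img[r + dr - 1][c + dc - 1]
                    gx += SOBEL_X[dr][dc] * value
                    gy += SOBEL_Y[dr][dc] * value
            if math.hypot(gx, gy) >= thresh:
                edges[r][c] = 1
    return edges
```

On an image whose left half is dark and right half is bright, only the two columns on either side of the step are marked, which is the boundary a segmentation step would then follow.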

1.4 Region-based method

The fundamental idea behind the region-based method is a typical standard serial region segmentation algorithm, and its main idea is to group pixels with similar features into regions [18, 19]. The procedure involves first choosing a seed pixel and then merging the neighboring similar pixels into the area where the seed pixel is situated. Only a few seed points are needed to fulfil the concept of region-based segmentation. In addition, it can select several segments of regions simultaneously when selecting seed pixels. In noisy images, region-based methods are often superior in segmentation (where borders are difficult to detect) [20]. In [21] a novel approach called SiWOA-FUSION is proposed to tackle brain diseases. The method utilizes optimal region-based segmentation with the Sine function-adapted improved Whale Optimization Algorithm (SiWOA). Segmented regions and discrete wavelet coefficients are fused using interval type-2 fuzzy rules, and the visual quality of the resulting image is optimized using SiWOA with the amount of correlation of differences (ACD) as the objective function. The experimental results validate the superior performance of the proposed method compared to existing approaches when evaluated on standard benchmark databases. In [22], the authors present a specialized method for filtering Computed Tomography (CT) images obtained with lower X-ray dosage. The introduced technique modifies the conventional amoeba filtering approach by adjusting the pilot image formation and amoeba shaping mechanism. It utilizes a pilot image based on the Wiener filter and region-based segmentation for shaping the amoeba, resulting in benefits such as compatibility with CT images due to their similarity to human body anatomy, effective image smoothing without compromising edge details, and adaptability and resilience to noise. Encouraging results highlight the effectiveness of the proposed approach. 
In [23], a novel method for segmenting brain magnetic resonance imaging is introduced. It combines a region-based active contour model and an evolutionary algorithm to achieve precise edge extraction, handle intensity variation, and reduce noise. The proposed model incorporates a local similarity factor based on the Bilateral filter principle (LSFB) and employs a multi-population genetic algorithm (MPGA) to optimize energy parameters, resulting in enhanced search performance and solution quality compared to single-population models.
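The seed-and-merge procedure described at the start of this subsection can be sketched as a simple region-growing routine. The function name `region_grow`, the 4-connected neighborhood, and the similarity test (absolute difference to the seed intensity within a tolerance `tol`) are assumptions chosen for brevity; other similarity criteria, such as comparison with the running region mean, are equally possible.

```python
from collections import deque


def region_grow(img, seed, tol):
    """Grow a region from `seed` = (row, col) over 4-connected neighbors.

    A neighbor joins the region when its intensity differs from the seed
    intensity by at most `tol`. Returns the set of (row, col) pixels.
    """
    h, w = len(img), len(img[0])
    seed_value = img[seed[0]][seed[1]]
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if not (0 <= nr < h and 0 <= nc < w):
                continue  # outside the image
            if (nr, nc) in region:
                continue  # already merged
            if abs(img[nr][nc] - seed_value) <= tol:
                region.add((nr, nc))
                queue.append((nr, nc))
    return region
```

Running the routine from several seeds yields several regions, which matches the remark above that multiple regions can be selected when the seed pixels are chosen.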

1.5 Watershed-based method

These methods are based on a topographic interpretation of the image, in which intensity is represented as a landscape of basins that fill with water entering through holes at their minima [24]. Adjacent basins are joined when the water reaches a basin’s edge. In watershed-based approaches, the image gradient is treated as a topographic surface, and pixels with high gradient magnitude form continuous borders. The main advantage of the watershed-based method is that the obtained segments are more stable and the detected boundaries are distinct [25], but the gradient computation is complex. The authors in [26] presented a new method for segmenting the liver in abdominal CT images, utilizing marker-based watershed transforms. In the pre-processing stage, a modified double-stage Gaussian filter (MDSGF) is applied to enhance contrast and retain edge and texture information in the liver CT images. The experimental results demonstrate the superior performance of the proposed approach compared to other competitive algorithms.

1.6 Clustering-based method

The concept of the cluster segmentation method is a process of collecting a large group of data into clusters with high intra-cluster and low inter-cluster similarity [27]. These clustering techniques also divide the image into many groups with related traits. Hard clustering and soft clustering are two fundamental categories of approaches. One pixel can only belong to one cluster when using hard clustering. As a result, there is a binary membership (i.e. 0 or 1) among elements and clusters. This method computes all cluster centers before allocating the elements to the closest cluster. One of the most significant applications of hard clustering is K-means [28]. However, soft clustering is the opposite of hard clustering in the use of fractional membership, making it more practical for use in practical applications. A novel method is introduced in [29] to achieve optimal lesion segmentation by integrating swarm intelligence-based algorithms with traditional Fuzzy C-means (FCMs) and K-means clustering algorithms. The objective is to improve global convergence and minimize the risk of getting stuck in local optima. In [30], a new algorithm is proposed for medical image segmentation that combines density peak clustering (DPC) with the fruit fly optimization algorithm to detect diseases early. Through experiments conducted on benchmark datasets and proprietary datasets, the algorithm demonstrates adaptive segmentation capabilities with faster convergence and improved robustness. The authors in [31] introduced a hybrid approach for COVID-19 screening using chest CT images, utilizing Hybrid Particle Swarm Optimized and Fuzzy C Means Clustering. The proposed segmentation method was evaluated on 15 chest CT images of COVID-19-infected patients. 
Quantitative validation using metrics such as entropy, contrast, and standard deviation demonstrated the superior performance of the proposed approach compared to conventional Fuzzy C Means Clustering, indicating its effectiveness for efficient COVID-19 screening.
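Hard clustering as described in this subsection can be illustrated with a minimal K-means sketch on pixel intensities, in which every pixel receives exactly one cluster label. The function name `kmeans_intensity`, the use of gray level as the only feature, the caller-supplied initial centers, and the fixed iteration count are assumptions for this example.

```python
def kmeans_intensity(values, centers, iters=20):
    """Cluster gray levels with K-means (hard clustering).

    `values` is a flat list of intensities and `centers` the initial
    cluster centers. Returns (labels, centers) after `iters` updates.
    """
    centers = list(centers)

    def assign():
        # Each value gets the index of its nearest center (binary membership).
        return [min(range(len(centers)), key=lambda k: abs(v - centers[k]))
                for v in values]

    for _ in range(iters):
        labels = assign()
        for k in range(len(centers)):
            members = [v for v, label in zip(values, labels) if label == k]
            if members:  # keep the old center if a cluster is empty
                centers[k] = sum(members) / len(members)
    return assign(), centers
```

A soft clustering method such as Fuzzy C-means would replace the single label per pixel with a vector of fractional memberships, one per cluster.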

1.7 Partial Differential Equation (PDE) based method

Most of the literature considers the PDE-based method to be the fastest technique for image segmentation tasks, which makes it suitable for use in real-time applications [32]. Two types of PDE techniques are commonly used: the convex non-quadratic variation, which is used to remove noise from the image, and the nonlinear isotropic diffusion filter, which is used to enhance the image’s edges. In [33], an improved partial differential equation (PDE)-based total variation (TV) model is introduced to enhance gray and color brain tumor images captured by magnetic resonance imaging. The experimental results indicate that the proposed method performs better than the competing algorithms. To address microscopic image noise in the medical field, a segmentation model utilizing fourth-order partial differential equation smoothing is proposed in [34]. By incorporating the directional curvature mode value to describe image smoothness, a fourth-order PDE image denoising model was derived. This model effectively reduces noise while preserving edges, demonstrating stability, robust contour extraction, and fast convergence.

Many image segmentation techniques [35] have been proposed, based on traditional methods, swarm intelligence techniques, and metaheuristics. For example, the researchers in [36] performed segmentation using wind-driven optimization and a cuckoo search algorithm for multilevel thresholding, with Kapur’s entropy as the objective function. The researchers in [37] proposed a novel version of the manta ray foraging optimization algorithm, enhanced with opposition-based learning, to solve the multilevel thresholding image segmentation problem on COVID-19 CT images. In [38], a graph partitioning scheme based on swarm intelligence is proposed, which uses particle swarm optimization to segment images.
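To make the role of the objective function concrete, the following sketch evaluates Kapur’s entropy for a given set of thresholds; a metaheuristic would search for the thresholds that maximize this value. The function name `kapur_entropy`, the histogram passed as a list of counts, and the convention that a threshold t closes the class containing gray levels up to and including t are assumptions for illustration, not details from the cited works.

```python
import math


def kapur_entropy(hist, thresholds):
    """Sum of the Shannon entropies of the classes induced by `thresholds`.

    `hist[g]` is the number of pixels with gray level g. Each class holds
    the gray levels between two consecutive thresholds; probabilities are
    renormalized inside every class before the entropy is taken.
    """
    total = sum(hist)
    probs = [count / total for count in hist]
    bounds = [0] + [t + 1 for t in sorted(thresholds)] + [len(hist)]
    entropy = 0.0
    for lo, hi in zip(bounds, bounds[1:]):
        weight = sum(probs[lo:hi])
        if weight == 0:
            continue  # empty class contributes nothing
        for p in probs[lo:hi]:
            if p > 0:
                entropy -= (p / weight) * math.log(p / weight)
    return entropy
```

For a flat four-level histogram, the balanced split after level 1 scores higher than the unbalanced split after level 0, so an optimizer maximizing this objective would prefer the balanced threshold.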

1.8 Motivation

This review aims to help researchers use metaheuristic and swarm intelligence methods to improve image segmentation in the medical field. Ultimately, the key contributions of this article can be condensed as follows.

  1. Analyze and show state-of-the-art applications based on image segmentation using metaheuristic algorithms.

  2. Discuss the image segmentation techniques used with metaheuristic algorithms and swarm intelligence techniques.

  3. Analyze the metrics recently used for segmentation performance estimation.

  4. Discuss the publicly available image databases used for image segmentation.

This paper provides an overview of different image segmentation techniques based on nature-inspired optimization algorithms (NIOAs) that are used in the medical field to detect and treat human diseases. The remainder of this paper is organized as follows. The basics and background of nature-inspired optimization algorithms are discussed in Sect. 2. Performance evaluation metrics for image segmentation are described in Sect. 3. An overview of medical imaging modalities and available datasets is given in Sect. 4. Image segmentation using NIOAs is presented in Sect. 5. In Sect. 6, challenges and future trends are discussed. The conclusion is given in Sect. 7.

2 Nature-inspired optimization algorithms overview

The nature-inspired optimization algorithms (NIOAs) [39,40,41] provide rules and operators that can be applied to real problems to solve complex optimization problems where conventional mathematical techniques are not suitable. NIOAs are described as an alternative method that is used to solve a specific problem by intelligently combining various concepts to explore and exploit the search space. They are inspired by simulations of natural phenomena. NIOAs can be classified into metaphor and non-metaphor nature-inspired optimization algorithms. The metaphor optimization algorithms are algorithms that mimic natural phenomena, mathematical operations, and the behavior of humans in real life, such as biology-based algorithms, chemistry-based algorithms, music-based algorithms, math-based algorithms, physics-based algorithms, and human-based algorithms. However, the non-metaphor optimization algorithms did not utilize emulation to determine their search methods, such as Tabu Search (TS) and Variable Neighborhood Search (VNS). Figure 2a represents the statistical studies of image segmentation using NIOAs to treat human diseases conducted from 2012 to 2022 based on information from Scopus databases. Figure 2b illustrates the distribution of image segmentation with NIOAs for the human disease research area. The following subsections discuss types of nature-inspired optimization algorithms in detail.

Fig. 2 The publications on image segmentation using NIOAs to treat human diseases in the last decade [2012-2022], as reported by the Scopus database

2.1 Metaphor optimization algorithms

This subsection explains the first type of nature-inspired algorithm that emulates mathematical operations, natural phenomena, and the behavior of humans and organisms in real life. The types of metaphor optimization algorithms are explained as follows:

2.1.1 Biology-based algorithms

These algorithms are concerned with the emulation of different biological metaphors in nature (components, structure, etc.). In addition, they are classified into three main categories: swarm intelligence, evolutionary algorithms, and artificial immune systems. Evolutionary algorithms (EAs) are algorithms that mimic the biological phenomena of evolution using crossover, selection, reproduction, and mutation operators to generate optimal solutions, such as evolutionary strategies [42], evolutionary programming [43], genetic programming [44], and genetic algorithms [45]. Swarm intelligence algorithms emulate individuals’ natural behaviors in groups, such as insects and birds. Swarm intelligence relies primarily on the decentralization concept, in which search agents upgrade their location based on their interaction with other agents and their environment. The most popular algorithms in swarm intelligence are Ant Colony Optimization [46], Particle Swarm Optimization [47], Harris Hawks Optimization [48], Ant Lion Optimizer [49], and Whale Optimization Algorithm [50]. Artificial Immune Systems (AIS) [51] are inspired by theoretical immunology and observed immune models, principles, and functions. AIS can be considered as another algorithmic variation of evolutionary algorithms [52]. Furthermore, most AIS-based nature-inspired algorithms depend on the clonal selection principles such as the clonal selection Algorithm [53], negative selection Algorithms [54], and the B-cell Algorithm [55]. Therefore, AIS and SI could be classified as a subcategory of evolutionary algorithms.

2.1.2 Chemistry-based algorithms

Chemistry-based algorithms are inspired by chemical reactions and experiments in nature. Two representative algorithms are Chemical Reaction Optimization and Gases Brownian Motion Optimization. First, chemical reaction optimization (CRO) [56] mimics the interactions of molecules in a chemical reaction as they move toward a low-energy stable state. A molecule in CRO represents a candidate solution, and each molecule has two kinds of energy: kinetic energy (KE) and potential energy (PE). Researchers used these two energy types to prove the superiority of CRO over other competitive algorithms. Secondly, the Gases Brownian motion optimization (GBMO) [57] algorithm takes its inspiration from the motion of suspended fluid particles (Brownian motion and turbulent rotational motion). Each candidate solution in GBMO is represented by a molecule with a set of properties such as velocity, mass, turbulent radius, and position. According to Brownian motion, the molecules move toward the goal and are evaluated on the basis of their location. The temperature of the environment is the main control parameter in GBMO: it determines the velocity and kinetic energy of the molecules and thus the trade-off between diversification and intensification. A high temperature increases the velocity and kinetic energy of the molecules and favors exploration, whereas a low temperature reduces the random Brownian motion and favors exploitation. Furthermore, researchers proposed a computational approach inspired by chemical reactions in nature, named the Artificial Chemical Reaction Optimization Algorithm (ACROA) [58].

2.1.3 Music-based algorithms

Music-based algorithms are inspired by natural music when musicians look for better harmonies and emulate the musician’s improvisation process. The most popular algorithms in music-based algorithms are Harmony Search (HS) and Musical Composition Method (MMC). Harmony Search (HS) is proposed by Geem et al. [59], which emulates the music improvisation process of musicians. Each candidate solution in HS is represented by a vector. When all the values of a solution vector have a high fitness (good harmony), this experiment is saved in each variable memory, and the likelihood of making a good solution increases the next time. Researchers used an Energy Curve instead of a histogram with the Harmony Search Algorithm [60] and the Otsu strategy to calculate optimized gray levels. Method of Musical Composition (MMC) [61, 62] is an imitation of a natural musical society in which musicians exchange information with each other and with their community to produce a musical composition. The main idea of MMC, as with HS, is to produce good musical artwork from the composer’s tunes. MMC, on the other hand, is not a variant of HS because it adheres to a social system that includes exchange and learning methodologies. Each candidate solution in MMC is represented by a composer tune, which is an n-dimensional vector of decision variables. During the initialization phase, each composer creates a set of random tunes that are saved in a scoring matrix such as HS. After that, the information exchange is executed according to the interaction rules.

2.1.4 Physics-based algorithms

Physics-based algorithms are inspired by the laws of physics that operate in nature behind different phenomena to address real-world problems in optimization. Henry gas solubility optimization (HGSO) [63] is one of the main algorithms in this category. HGSO algorithm is inspired by Henry’s law for addressing difficult optimization problems. Additionally, HGSO emulates the huddling behavior of gas to balance between the exploitation and exploration phases in the search space and avoid local optima. The nuclear reaction optimization (NRO) algorithm [64] emulates the nuclear reaction process, and this algorithm consists of two steps; the nuclear fusion (NFu) step and the nuclear fission (NFi) step to address engineering design optimization problems. The gravitational search algorithm (GSA) simulates Newton’s laws of gravity and motion. Each candidate solution in GSA is considered an object that has inertial mass, passive gravitational mass, active gravitational mass, and position. In GSA, the position of the objects represents the solution and their fitness is estimated by their masses.

2.1.5 Human-based algorithms

Human-based algorithms mimic the social behavior of humans. The main algorithms in this category are the Heap Based Optimizer (HBO) [65], Teaching-Learning-Based Optimization (TLBO) [66], and League Championship Algorithm (LCA) [67]. HBO mimics the behavior of humans based on the corporate rank hierarchy (CRH) to solve optimization problems. The mathematical model of HBO depends on three pillars: the relationship among coworkers, the relationship between subordinates and their immediate supervisor, and the self-contribution of employees. The TLBO algorithm is inspired by the teaching-learning process and is based on the effect of a teacher’s influence on the output of learners in a class. TLBO has two main modes of learning: learning from the teacher, and learning through interaction with other learners. Researchers proposed an improved TLBO, called DI-TLBO [68], which is applied to multilevel thresholding problems in image segmentation with Kapur’s entropy function and Otsu’s between-class variance function as the two objective functions. LCA is inspired by sports championships in which teams play in a league for several weeks and over many seasons. Each match is then analyzed using strengths, weaknesses, opportunities, and threats (SWOT) analysis. The winning team is determined by its playing strength (fitness value) and its team formation.

2.2 Non-metaphor optimization algorithms

This subsection explains the second type of algorithm, called non-metaphor. Unlike metaphor optimization algorithms, which rely on analogies to natural systems, non-metaphor optimization algorithms define their search methods directly, without emulating a natural phenomenon, and are used to solve complex optimization problems. Examples include the Tabu Search algorithm (TS) [69] and Partial Optimization Metaheuristic Under Special Intensification Conditions (POPMUSIC) [70].

The Tabu Search algorithm (TS) was designed by McMillan and Glover and is based on the premise that intelligent problem solving must include adaptive memory and responsive exploration; these are the two main features of TS. POPMUSIC is a decomposition-based metaheuristic. Starting from an available solution, it divides the solution into parts and repeatedly optimizes sub-problems formed from a small number of related parts. This partial optimization, applied under special intensification conditions, allows POPMUSIC to improve a solution to a large problem without searching the whole space at once.
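The two features of TS named above, adaptive memory and responsive exploration, can be illustrated with a minimal sketch. The function name `tabu_search`, the short-term memory implemented as a fixed-length queue of recently visited solutions, and the caller-supplied `cost` and `neighbors` functions are assumptions for this example; practical implementations usually store moves or attributes rather than whole solutions and add aspiration criteria.

```python
from collections import deque


def tabu_search(cost, start, neighbors, iters=100, tenure=5):
    """Minimize `cost` by moving to the best non-tabu neighbor each step.

    `neighbors(s)` returns candidate solutions reachable from s. The last
    `tenure` visited solutions are tabu (short-term adaptive memory), so
    the search can accept a worse move instead of cycling back.
    """
    current = best = start
    tabu = deque([start], maxlen=tenure)
    for _ in range(iters):
        candidates = [n for n in neighbors(current) if n not in tabu]
        if not candidates:
            break  # every neighbor is tabu, stop the search
        current = min(candidates, key=cost)  # may be worse than before
        tabu.append(current)
        if cost(current) < cost(best):
            best = current
    return best
```

In a thresholding setting, a solution could be a tuple of threshold values, `neighbors` could shift one threshold by one gray level, and `cost` could be the negative of a segmentation objective.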

3 Performance evaluation metrics of image segmentation

Image segmentation evaluation is a critical phase that determines whether the segmentation was performed accurately. Many techniques are used to evaluate the quality and performance of image segmentation. This section covers five evaluation metrics: mean squared error (MSE), peak signal-to-noise ratio (PSNR), feature similarity index measure (FSIM), structural similarity index measure (SSIM), and quality index based on local variance (QILV).

3.1 Mean Squared Error (MSE)

MSE is an evaluation metric in image segmentation that assesses the difference between a segmented image and an original image [71]. It is the average squared difference between the intensity values of the segmented image and the original image:

$$\begin{aligned} \begin{aligned} MSE=\frac{1}{xy} \sum _{u=1}^{x} \sum _{v=1}^{y} (I(u,v)-Seg(u,v))^2 \end{aligned} \end{aligned}$$
(1)

Here x and y are the dimensions of the image, I is the original image, and Seg is the segmented image. A lower MSE means a smaller difference between the original and segmented images, so the lowest MSE gives the best segmentation. The MSE equals zero only when every pixel in the segmented image has the same intensity value as the pixel at the same position in the original image.
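A direct transcription of the MSE definition is given below as a minimal sketch. The function name `mse` and the representation of both images as equally sized lists of rows are assumptions for illustration.

```python
def mse(original, segmented):
    """Mean squared error between two equally sized gray-scale images.

    Both images are lists of rows of intensity values.
    """
    rows, cols = len(original), len(original[0])
    total = 0.0
    for u in range(rows):
        for v in range(cols):
            diff = original[u][v] - segmented[u][v]
            total += diff * diff
    return total / (rows * cols)
```

For two 2 x 2 images that differ by 10 gray levels in a single pixel, the sum of squared differences is 100 and the MSE is 100 / 4 = 25.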

3.2 Peak Signal to Noise Ratio (PSNR)

PSNR [72] is an evaluation metric used to estimate the quality of the segmentation by comparing the original and segmented images through the Root Mean Square Error (RMSE). The formula of PSNR is:

$$\begin{aligned} \begin{aligned} PSNR=20\,\log _{10} \left( \dfrac{255}{RMSE}\right) \end{aligned} \end{aligned}$$
(2)

where the RMSE can be defined by the following:

$$\begin{aligned} \begin{aligned} RMSE = \sqrt{\dfrac{\sum _{u=1}^x \sum _{v=1}^y \left( \left( I(u,v)-Seg(u,v)\right) ^2\right) }{x \times y}} \end{aligned} \end{aligned}$$
(3)

where Seg and I are the segmented image and the original image, each of size \(x \times y\), and 255 is the maximum intensity value of an 8-bit image. Lower PSNR values mean a lower similarity between the original and segmented images. Hence, the higher the PSNR, the better the segmentation.
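The PSNR and RMSE definitions can be combined in a short sketch. The function name `psnr`, the `peak` parameter defaulting to 255, and the choice to return infinity for identical images (where the RMSE is zero and the ratio is undefined) are assumptions for this example.

```python
import math


def psnr(original, segmented, peak=255):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Both images are lists of rows of intensity values. Identical images
    have RMSE 0, for which this sketch returns infinity.
    """
    rows, cols = len(original), len(original[0])
    total = 0.0
    for u in range(rows):
        for v in range(cols):
            total += (original[u][v] - segmented[u][v]) ** 2
    rmse = math.sqrt(total / (rows * cols))
    if rmse == 0:
        return math.inf
    return 20 * math.log10(peak / rmse)
```

Because the RMSE appears in the denominator, a smaller error gives a larger PSNR, which is why higher values indicate better segmentation.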

3.3 Feature Similarity Index Measure (FSIM)

FSIM [73, 74] is another evaluation metric, used to estimate the similarity between the original and the segmented image based on their internal features. It relies on two main features: phase congruency (PC) and gradient magnitude (GM). FSIM uses these two features to characterize the local structure of each image. Higher FSIM values mean a greater similarity between the original and segmented images; hence, a lower FSIM indicates a worse segmentation. FSIM is computed as follows.

$$\begin{aligned} \begin{aligned} FSIM= \frac{ \sum _{u\in \Omega }S_L(u)PC_{m}(u)}{\sum _{u\in \Omega }PC_{m}(u)} \end{aligned} \end{aligned}$$
(4)

\(\Omega\) is the entire image domain:

$$\begin{aligned} \begin{aligned} S_L(u)=S_{PC}(u)S_G(u) \end{aligned} \end{aligned}$$
(5)
$$\begin{aligned} \begin{aligned} S_{PC}(u)= \frac{2PC_1(u)PC_2(u)+T_1}{PC_1^2(u)+PC_2^2(u)+T_1} \end{aligned} \end{aligned}$$
(6)
$$\begin{aligned} \begin{aligned} S_G(u)= \frac{2G_1(u)G_2(u)+T_1}{G_1^2(u)+G_2^2(u)+T_1}, \end{aligned} \end{aligned}$$
(7)

\(T_1\) is a positive constant that increases the stability of \(S_{PC}\) and \(S_G\), and the subscripts 1 and 2 denote the two images being compared. G is the image’s gradient magnitude and can be calculated as follows:

$$\begin{aligned} \begin{aligned} G=\sqrt{G_y^2+G_x^2} \end{aligned} \end{aligned}$$
(8)
$$\begin{aligned} \begin{aligned} PC(u)= \frac{E(u)}{\left( \epsilon + \sum _{n}A_n(u) \right) } \end{aligned} \end{aligned}$$
(9)

E(u) is the local energy at position u, \(A_n(u)\) is the local amplitude at scale n, and \(\epsilon\) is a small positive number. \(PC_m(u) = \max (PC_1(u),PC_2(u))\).

3.4 Structural Similarity Index Measure (SSIM)

SSIM [75] is an evaluation metric that is used to analyze internal structures in the segmentation image. Higher SSIM values mean greater similarity between two images (original and segmented images). Therefore, better segmentation means higher SSIM. The following equation represents the formula of SSIM.

$$\begin{aligned} \begin{aligned} SSIM(I,Seg)= \dfrac{\left( 2\mu _I \mu _{Seg} + c_a\right) \left( 2\sigma _{I,Seg} +c_b \right) }{\left( \mu _I^2 + \mu _{Seg}^2 + c_a\right) \left( \sigma _I^2 + \sigma _{Seg}^2 +c_b\right) } \end{aligned} \end{aligned}$$
(10)

\(\mu _I\) and \(\mu _{Seg}\) are the mean of the intensity values of the original and segmented images, respectively. The standard deviation values of the original and segmented images are \(\sigma _{I}\) and \(\sigma _{Seg}\), respectively. Additionally, the covariance of the original and segmented images is \(\sigma _{I, Seg}\), \(c_a\) and \(c_b\) are two constant numbers.
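The SSIM formula can be sketched with whole-image statistics as follows. The function name `ssim_global` is hypothetical, and two simplifications are assumed: the statistics are computed once over the entire image rather than over local windows that are then averaged, and the constants default to \(c_a = (0.01 \cdot 255)^2\) and \(c_b = (0.03 \cdot 255)^2\), a common choice for 8-bit images that the text above does not specify.

```python
def ssim_global(original, segmented,
                c_a=(0.01 * 255) ** 2, c_b=(0.03 * 255) ** 2):
    """SSIM computed from global means, variances, and covariance.

    Both images are equally sized lists of rows with at least two pixels.
    Treating the whole image as a single window is a simplification.
    """
    x = [p for row in original for p in row]
    y = [p for row in segmented for p in row]
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / (n - 1)
    var_y = sum((b - mu_y) ** 2 for b in y) / (n - 1)
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / (n - 1)
    numerator = (2 * mu_x * mu_y + c_a) * (2 * cov + c_b)
    denominator = (mu_x ** 2 + mu_y ** 2 + c_a) * (var_x + var_y + c_b)
    return numerator / denominator
```

For identical images the covariance equals the variance and both factors reduce to 1, so the index reaches its maximum; the constants keep the ratio defined when the means or variances are close to zero.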

3.5 Quality Index Based on Local Variance (QILV)

QILV index is another evaluation metric proposed in [76] to measure the quality of segmented images. The authors used QILV to compare the Golden Standard (GS) with the local variance distribution of the image. The following equation describes the local variance of the image:

$$\begin{aligned} \begin{aligned} Var(I_{u,v})=E\left( \left( I_{u,v}-\overline{\textrm{I}_{u,v}}\right) ^2\right) \end{aligned} \end{aligned}$$
(11)

\(\eta _{u,v}\) is the weighted neighborhood centered on pixel (u, v), which is used to estimate \(\overline{\textrm{I}_{u,v}}=E(I_{u,v})\) with the weights \(\omega _p\). The following steps are used to compute the local variance:

$$\begin{aligned} \begin{aligned} Var(I_{u,v})=\frac{\sum _{p\in \eta _{u,v}}\omega _p\left( I_{p}-\overline{\textrm{I}_{u,v}}\right) ^2}{\sum _{p\in \eta _{u,v}}\omega _p}, \quad \text {where} \quad \overline{\textrm{I}_{u,v}}=\frac{\sum _{p\in \eta _{u,v}}\omega _p I_p}{\sum _{p\in \eta _{u,v}}\omega _p} \end{aligned} \end{aligned}$$
(12)
$$\begin{aligned} \begin{aligned} \mu _{V_I}=\frac{1}{XY} \sum _{u=1}^{X} \sum _{v=1}^{Y} Var(I_{u,v}) \end{aligned} \end{aligned}$$
(13)

where the local variances’ mean is \(\mu _{V_I}\).

$$\begin{aligned} \begin{aligned} \sigma _{V_I}=\left( \frac{1}{XY-1}\sum _{u=1}^{X} \sum _{v=1}^{Y}\left( Var(I_{u,v})-\mu _{V_I}\right) ^2\right) ^{\frac{1}{2}} \end{aligned} \end{aligned}$$
(14)

\(\sigma _{V_I}\) describes the local variances’ standard deviation.

$$\begin{aligned} \begin{aligned} \sigma _{V_I\ V_J}=\frac{1}{XY-1}\sum _{u=1}^{X} \sum _{v=1}^{Y}\left( Var(I_{u,v})-\mu _{V_I}\right) \left( Var(J_{u,v})-\mu _{V_J}\right) \end{aligned} \end{aligned}$$
(15)

where \(\sigma _{V_I\ V_J}\) is the covariance between the local variances of the original image I and the segmented image J.

Therefore,

$$\begin{aligned} \begin{aligned} QILV(I, J)=\frac{2\mu _{V_I}\mu _{V_J}}{\mu _{V_I}^2+\mu _{V_J}^2} \cdot \frac{2\sigma _{V_I}\sigma _{V_J}}{\sigma _{V_I}^2+\sigma _{V_J}^2}\cdot \frac{\sigma _{V_I\ V_J}}{\sigma _{V_I}\sigma _{V_J}} \end{aligned} \end{aligned}$$
(16)

QILV(I, J) thus represents the difference in quality between the original image I and the segmented image J.
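The QILV computation can be sketched as follows. This is a minimal sketch under stated assumptions rather than the reference implementation of [76]: the weights \(\omega _p\) are taken to be uniform over a square window, only window positions that lie fully inside the image are used (so XY is the number of such positions), and no stabilising constants are added, so the function is undefined for images whose local variance is constant. The function names and the window size are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def local_variance(img, size=5):
    """Local variance with uniform weights over a size x size window."""
    windows = sliding_window_view(img.astype(float), (size, size))
    return windows.var(axis=(-2, -1))

def qilv(img, seg, size=5):
    """Product of mean, spread and correlation terms of the local variances."""
    v_i = local_variance(img, size)
    v_j = local_variance(seg, size)
    mu_i, mu_j = v_i.mean(), v_j.mean()                        # means of local variances
    sd_i, sd_j = v_i.std(ddof=1), v_j.std(ddof=1)              # their standard deviations
    cov = ((v_i - mu_i) * (v_j - mu_j)).sum() / (v_i.size - 1)  # their covariance
    mean_term = 2.0 * mu_i * mu_j / (mu_i ** 2 + mu_j ** 2)
    spread_term = 2.0 * sd_i * sd_j / (sd_i ** 2 + sd_j ** 2)
    corr_term = cov / (sd_i * sd_j)
    return mean_term * spread_term * corr_term

# Synthetic data: a random image and a two-level segmentation of it.
rng = np.random.default_rng(0)
original = rng.integers(0, 256, size=(40, 40)).astype(float)
segmented = np.where(original > 127, 255.0, 0.0)

score_same = qilv(original, original)
score_seg = qilv(original, segmented)
```

Each of the three factors is at most 1, so the index reaches 1 only when the local variance maps of the two images agree in mean, spread, and spatial pattern.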

4 Medical images modalities and available datasets

This section presents an overview of the types of images and public data sets that are used to segment medical images using nature-inspired optimization algorithms. Many types of images are used in the medical field to analyze and diagnose diseases, for example in the detection of breast cancer [77, 78], the detection of brain tumors [79], the diagnosis of lung diseases [80], cardiovascular treatment [81], and the identification of retinal diseases [82]. The studies covered in this review also use many different datasets. Therefore, the following two subsections explain in detail the types of medical images and the available databases.

4.1 Overview of the medical imaging

Recently, considerable attention has been paid to medical imaging due to its increasing use in clinical diagnosis, treatment evaluation, and the detection of abnormalities in various body organs such as the lungs [83], stomach [84], eyes [85], breasts [86], and brain [87]. Many techniques produce different types of medical images to diagnose, treat, and monitor the human body. Therefore, many researchers have concentrated their efforts on the production and analysis of medical images in order to diagnose most diseases. Image segmentation and nature-inspired optimization algorithms have played a vital role in the medical field and have enabled significant advances in recent decades in medical image analysis for identifying pathological lesions and guiding their treatment. Medical images come from different modalities, such as ultrasound, magnetic resonance imaging, mammography, thermography, and histological imaging. The following sub-sections introduce an overview of most of the medical image types. Figure 3 shows the distribution of different imaging modalities used in the studies in this review.

Fig. 3 A bar plot of the different imaging modalities used in the studies in the last decade [2012-2022] as reported by the Scopus database

4.1.1 Ultrasound images

Ultrasound imaging, also called sonography, uses high-frequency sound waves to show the inside of the body. Unlike X-ray imaging, it can also show the internal movements of body organs and the blood flow within blood vessels. Researchers have used ultrasound imaging to solve many problems in the medical field, such as identifying liver diseases [88] and treating stroke [89].

4.1.2 Magnetic Resonance Imaging (MRI)

MRI is another imaging modality that produces detailed three-dimensional (3D) anatomical images, used to detect cancer cells in the body, identify diseases that attack the body, and monitor their treatment. MRI relies on a strong magnetic field and radio waves rather than ionizing radiation to produce accurate 3D images. When researchers use MRI [90, 91], the differences between the infected regions and the rest of the body become very clear. MRI is used to identify several conditions, for example in the detection of the first type of breast cancer, automated brain tumor segmentation and detection [92], and liver segmentation to identify liver diseases [93].

4.1.3 Mammography images

A mammogram is an X-ray image used to examine the breast in order to detect breast cancer and other breast diseases early; only a small dose of X-rays is needed to generate breast images [94, 95]. The X-ray image shows the breast in variations of white and gray. Denser tissue, which includes glands, normal tissue, and benign (noncancerous) areas of the breast, appears white, whereas less dense tissue and fat appear gray on a mammogram.

4.1.4 Thermography images

Thermography is an imaging technique that records the temperature of the body's surface; the resulting images are called thermograms. Thermograms appeared as a type of medical image in 1982 [96]. These images use the infrared radiation emitted by the body, because all bodies with a temperature above absolute zero emit infrared radiation according to the law of black body radiation. The obtained thermograms are composed of different shades of gray: lighter tones correspond to hotter regions, and darker tones correspond to colder areas of the body. Many researchers have used thermography to study several problems in the medical field, such as joint diseases, vertebral column diseases, tendon damage, and breast cancer [97, 98].

4.1.5 Histological images

Histopathology images are medical images that researchers use to analyze and diagnose several human diseases. These images are obtained by attaching specific cameras to a microscope for tissue imaging and by studying the constituents of tissues under the microscope. Histopathological images play a vital role in the investigation of biological structures and the diagnosis of various diseases such as cancer [99, 100]. However, because digital histopathology images have specific properties and tasks, they usually require specific processing approaches.

4.1.6 Computed Tomography images

Computed Tomography (CT) imaging, also known as a CT scan, is a medical imaging technique used to produce detailed visualizations of internal bodily structures. By integrating X-ray images obtained from various angles, it forms cross-sectional representations of the body, commonly referred to as "slices." These images offer healthcare professionals a comprehensive view of the body, aiding in the diagnosis and monitoring of a range of conditions affecting diverse organs and tissues [101].

4.2 Overview of the medical datasets

This subsection presents an overview of the medical data sets used for segmentation with nature-inspired optimization algorithms. Researchers use many datasets to analyze and diagnose various human diseases, such as cancer, joint disease, vertebral column disease, liver disease, and kidney disease. Examples include Magnetic Resonance Brain Images (MRBI), COVID-19, Cancer Data Repository (CDR), Internet Brain Segmentation Repository (IBSR), Digital Database for Screening Mammography (DDSM), Image Retrieval in Medical Applications (IRMA), Mammographic Image Analysis Society (MIAS) / mini-MIAS, Breast Cancer Histopathological Image (BreakHis), and Liver Ultrasound Images (LUI). Table 1 lists the data sets used in this review together with their URLs.

Table 1 Index of datasets used in this review and accompanied by URLs

5 Image segmentation using nature-inspired optimization algorithms

Image segmentation is considered the first and most fundamental operation for analyzing the acquired image in different applications of computer vision, such as medical imaging [115], autonomous target recognition [116], geographic imaging [117], and robotic vision [118]. In general, image segmentation is defined as the splitting of an image into multiple segments (for example, foreground and background) based on some features, such as textures or gray-level values. Mandal proposed a strong version of image segmentation in [119] by using particle swarm optimization to achieve satisfactory segmentation performance. Furthermore, researchers have used nature-inspired algorithms in image segmentation to solve optimization problems and achieve the optimal solution [120]. In [121], the authors used the allostatic mechanism as the essential model and compared the Allostatic Optimization (AO) algorithm with Artificial Bee Colony (ABC), Differential Evolution (DE), and Particle Swarm Optimization (PSO) to improve image segmentation. Nebti used several swarm-based methods in [122] for color image segmentation, namely the predator–prey optimizer (PPO), the bee algorithm, and the cooperative co-evolutionary optimizer. The following subsections provide a review of the most popular image segmentation methods and their improvement using nature-inspired algorithms. Figure 4a shows the number of studies on image segmentation using NIOAs in medicine published from 2012 to 2022, based on information from the Scopus database. Figure 4b shows a chart of the different image segmentation techniques used in the treatment of human diseases discussed in this review.

Fig. 4 The publications of image segmentation using NIOAs in medicine performed in the last decade [2012-2022] as reported by the Scopus database

5.1 Thresholding-based image segmentation using NIOAs

The thresholding problem can be summarized as finding the optimal threshold values for an image. The image histogram is used to determine the threshold points; therefore, each image has its own set of optimal threshold values [123]. The Otsu and Kapur methods [124] are well-known methods for determining the optimal thresholds. However, the search for optimal multilevel thresholding (MTH) is a complex problem, and the challenges of MTH in image segmentation are explained in [125, 126]. The Otsu and Kapur methods are suitable for images that require a small number of thresholds, but when a large number of thresholds is required, the search becomes computationally expensive and the computing time grows very large. Therefore, NIOAs and Swarm Intelligence (SI) methods are used to solve complex image segmentation problems. NIOAs mimic the behavior of birds, animals, and humans in nature to obtain optimal solutions. Many NIOAs have already been applied to MTH segmentation problems, such as Manta Ray Foraging Optimization (MRFO) [37], Equilibrium Optimizer (EO) [127], Chimp Optimization Algorithm (ChOA) [128], Artificial Bee Colony (ABC) [129], Particle Swarm Optimization (PSO) [130], Bacterial Foraging Optimization (BFO) [131], and Cuckoo Search (CS) [132]. A Genetic Algorithm (GA) with Simulated Binary Crossover (SBX) [133] has been used to obtain optimal thresholds for the segmentation of medical brain images, and the results show that GA with SBX crossover performs better in optimal multilevel thresholding of medical images than the comparator algorithms.
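To make the cost of multilevel thresholding concrete, the following sketch evaluates Otsu's between-class variance for a set of thresholds and finds the best set by exhaustive search. It is an illustrative sketch, not the procedure of any cited study: the function names are hypothetical, the histogram is synthetic with only 32 gray levels, and a pixel with level g is assigned to the class whose half-open interval [lo, hi) contains g.

```python
import numpy as np
from itertools import combinations

def otsu_between_class_variance(hist, thresholds):
    """Otsu objective: weighted squared distance of class means to the global mean."""
    prob = hist / hist.sum()
    levels = np.arange(len(hist))
    total_mean = (prob * levels).sum()
    edges = [0] + sorted(int(t) for t in thresholds) + [len(hist)]
    variance = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        weight = prob[lo:hi].sum()
        if weight > 0:
            class_mean = (prob[lo:hi] * levels[lo:hi]).sum() / weight
            variance += weight * (class_mean - total_mean) ** 2
    return variance

def exhaustive_otsu(hist, k):
    """Try every combination of k thresholds and keep the best one."""
    candidates = combinations(range(1, len(hist)), k)
    return max(candidates, key=lambda t: otsu_between_class_variance(hist, t))

# Synthetic histogram with three well-separated modes on 32 gray levels.
rng = np.random.default_rng(0)
pixels = np.concatenate([rng.normal(5, 1, 500),
                         rng.normal(15, 1, 500),
                         rng.normal(26, 1, 500)])
pixels = np.clip(np.rint(pixels), 0, 31).astype(int)
hist = np.bincount(pixels, minlength=32)

best = exhaustive_otsu(hist, 2)  # one threshold in each gap between modes
```

With 32 levels and two thresholds there are only 465 candidate sets, but with 256 levels and five thresholds the number of combinations is roughly 8.6 billion, which is why metaheuristic search is attractive for MTH.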

The authors in [134] presented a modified artificial bee colony (ABC) algorithm, called CCABC, based on horizontal and vertical search mechanisms to enhance the algorithm's performance. The CCABC-based multilevel threshold image segmentation method was used to segment COVID-19 X-ray images, and it proved superior to competitor algorithms in most of the comparisons. The ABC algorithm is used in [135] to determine the optimal threshold values for the detection of melanoma, and the reported performance measures show that the proposed melanoma detection is better than that of competitor algorithms. A novel hybrid approach based on a thresholding method is proposed in [136] to solve the image segmentation problem (ISP) for COVID-19 chest X-ray images by combining the slime mold algorithm (SMA) with the whale optimization algorithm to maximize Kapur's entropy. The results showed that the hybrid approach outperforms the compared algorithms on all the metrics.

In [137], the authors proposed a new approach that combines Dynamic Particle Swarm Optimization (DPSO) with fuzzy c-means (FCM), using the global optimization search and parallel computing of DPSO to improve the results of the FCM algorithm. Magnetic Resonance Imaging (MRI) and synthetic images corrupted with various types of noise were used to test the proposed approach, and the results indicate that the new algorithm is less sensitive to noise and performs better than competing algorithms. The Harris Hawks Optimization (HHO) algorithm is combined with chaotic initialization and the concept of altruism in [138] to find optimal thresholds during segmentation of brain Magnetic Resonance Images (MRIs), and the results obtained showed the superiority of this HHO variant over existing state-of-the-art methods. The authors in [101] use the monarch butterfly optimization (MBO) algorithm to segment medical images at multiple threshold values and estimate the performance of the implemented MBO algorithm. The experimental results show the advantage of the MBO algorithm on medical images: with respect to accuracy and speed, it proved superior in most experiments, especially at threshold levels 3 and 4. The authors proposed a new multilevel image segmentation method in [139] based on an improved ant colony optimization algorithm to improve COVID-19 X-ray segmentation, in which swarm intelligence algorithms (SIAs) were used to ensure the validity of the experiments. In [140], the authors combine the Harris Hawks optimization technique with Otsu's method, which leads to a substantial decrease in computational cost while maintaining optimal results.
The effectiveness of this approach is assessed using publicly available imaging datasets, specifically chest images associated with clinical and genomic correlations from a rural COVID-19-positive population (COVID-19-AR). Comparative analysis using various performance metrics demonstrates that the proposed approach achieves a significant reduction in computational cost and convergence time while consistently producing results comparable to the Otsu method at the same threshold values. Traditional methods for multi-level threshold selection face the challenge of increased time complexity as the number of threshold levels grows. To address this issue, the authors in [141] introduced an improved method called OLAVOA, which is derived from a modified African vultures optimization algorithm. This approach combines predatory memory and a logarithmic spiral based on opposition learning, overcoming the limitations of traditional methods. The effectiveness of OLAVOA was verified through medical image segmentation experiments using chest X-ray images of COVID-19 patients and brain MRI images. The results demonstrated that OLAVOA outperformed alternative methods in segmentation tasks. Table 2 summarizes thresholding-based image segmentation using nature-inspired optimization algorithms.
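The general pattern shared by the studies above, a population-based optimizer searching for thresholds that maximize an entropy or variance criterion, can be sketched with a plain particle swarm optimizer and Kapur's entropy. This is a generic textbook-style sketch and not the algorithm of any cited paper: the inertia and acceleration coefficients (0.7, 1.5, 1.5), the swarm size, the iteration count, the synthetic histogram, and all function names are assumptions chosen for illustration.

```python
import numpy as np

def kapur_entropy(hist, thresholds):
    """Sum of the Shannon entropies of the classes induced by the thresholds."""
    prob = hist / hist.sum()
    edges = [0] + sorted(int(t) for t in thresholds) + [len(hist)]
    total = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        weight = prob[lo:hi].sum()
        if weight <= 0:
            return -np.inf  # empty class (or duplicate thresholds): reject
        p = prob[lo:hi] / weight
        p = p[p > 0]
        total -= (p * np.log(p)).sum()
    return total

def pso_thresholds(hist, k, n_particles=20, n_iter=60, seed=0):
    """Basic PSO over k real-valued thresholds, rounded before evaluation."""
    rng = np.random.default_rng(seed)
    levels = len(hist)
    evaluate = lambda x: kapur_entropy(hist, np.rint(x))
    pos = rng.uniform(1, levels - 1, size=(n_particles, k))
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_fit = np.array([evaluate(x) for x in pos])
    gbest = pbest[pbest_fit.argmax()].copy()
    gbest_fit = pbest_fit.max()
    for _ in range(n_iter):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, 1, levels - 1)
        fit = np.array([evaluate(x) for x in pos])
        improved = fit > pbest_fit
        pbest[improved] = pos[improved]
        pbest_fit[improved] = fit[improved]
        if pbest_fit.max() > gbest_fit:
            gbest_fit = pbest_fit.max()
            gbest = pbest[pbest_fit.argmax()].copy()
    return np.sort(np.rint(gbest).astype(int)), gbest_fit

# Synthetic 8-bit histogram with three intensity populations.
rng = np.random.default_rng(1)
pixels = np.concatenate([rng.normal(60, 10, 2000),
                         rng.normal(130, 15, 2000),
                         rng.normal(200, 10, 2000)])
hist = np.bincount(np.clip(np.rint(pixels), 0, 255).astype(int), minlength=256)

thresholds, fitness = pso_thresholds(hist, k=2)
```

Each fitness evaluation costs one pass over the histogram, so the total cost is controlled by the swarm size and iteration count rather than by the number of threshold combinations.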

Table 2 Summary of thresholding-based image segmentation using nature-inspired optimization algorithms

5.2 Clustering-based image segmentation using NIOAs

Clustering-based image segmentation is the process of grouping similar pixels together, using methods such as the K-means clustering algorithm, the fuzzy clustering algorithm, and other algorithms [153, 154]. In [155], the authors used the Modified Fuzzy K-Means (MFKM) algorithm with Bacteria Foraging Optimization (BFO) to determine the tumor region bounded by the edema and the normal tissue in Magnetic Resonance (MR) brain images. The proposed approach is compared with MFKM, the particle swarm optimization-based Fuzzy C-Means algorithm (FCM based on PSO), and the conventional FCM algorithm, and it proved superior in MR brain image segmentation. In [156], a newly developed version of Red Fox Optimization (DRFO) is applied with Kernel Fuzzy C-means to detect skin cancer from dermoscopy images. This approach is evaluated on the ISIC 2020 database, where it achieved the best results compared to other competitive algorithms. The authors in [157] combined the shuffled shepherd optimization algorithm (SSOA) with the salp swarm algorithm (SSA) and called the result SSSOA. SSSOA with a Generative Adversarial Network (GAN) model is used to detect lung cancer in Computed Tomography (CT) images. The SSSOA-based GAN method obtains the best results in terms of accuracy, similarity and dice coefficient compared to other algorithms. The authors in [158] integrate the social ski driver (SSD) algorithm with the Shuffled Shepherd Optimization Algorithm, calling the result SSSO, to detect lung cancer in computed tomography (CT) images. The SSOA-based Deep Renyi entropy fuzzy clustering (DREFC) algorithm is then used to segment the lung lobes. The proposed approach significantly improved accuracy, specificity, and sensitivity compared to other algorithms.
In [159], the PSO algorithm and the Mahalanobis distance are used to improve the fuzzy c-means (FCM) clustering algorithm; the resulting method, called improved spatial fuzzy c-means (IFCMS), is tested on simulated brain MRI images from the McConnell Brain Imaging Center database. The results demonstrated the efficiency of the presented method. In [160], Hybrid Sea Lion Squirrel Search Optimization (HSLnSSO) is used to improve the Fused Optimal Centroid K-means with K-Mediods Clustering (FOC-KKC) for the segmentation of dental caries. The results of the approach showed superior performance compared to competitive techniques.
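Since most of the studies in this subsection build on fuzzy c-means, a minimal sketch of the conventional FCM algorithm applied to pixel intensities may help. It is the standard alternating update of cluster centers and memberships, without any of the NIOA-based improvements described above; the fuzzifier `m = 2`, the random initialization, the synthetic data, and the function name are assumptions for illustration.

```python
import numpy as np

def fuzzy_c_means(values, n_clusters=2, m=2.0, n_iter=100, tol=1e-5, seed=0):
    """Conventional FCM on 1-D intensities; returns centers and memberships."""
    rng = np.random.default_rng(seed)
    x = np.asarray(values, dtype=float).ravel()
    u = rng.random((n_clusters, x.size))
    u /= u.sum(axis=0)                        # memberships of each pixel sum to 1
    for _ in range(n_iter):
        um = u ** m
        centers = (um @ x) / um.sum(axis=1)   # membership-weighted means
        dist = np.abs(x[None, :] - centers[:, None]) + 1e-12
        new_u = dist ** (-2.0 / (m - 1.0))    # closer centers get larger membership
        new_u /= new_u.sum(axis=0)
        converged = np.abs(new_u - u).max() < tol
        u = new_u
        if converged:
            break
    return centers, u

# Synthetic intensities drawn from a dark and a bright population.
rng = np.random.default_rng(0)
intensities = np.concatenate([rng.normal(50, 5, 200), rng.normal(200, 5, 200)])

centers, memberships = fuzzy_c_means(intensities, n_clusters=2)
labels = memberships.argmax(axis=0)           # hard labels for a segmentation map
```

The result depends on the random initial memberships, and this sensitivity to initialization is one motivation for pairing FCM with a global optimizer.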

5.3 Edge-based image segmentation using NIOAs

This subsection discusses edge detection techniques using NIOAs. The main goal of edge detection (ED) techniques is to identify the borders between two regions of the same image that are separated by their gray-level characteristics. As a result, many image processing applications use ED techniques, such as the detection of retinal blood vessels [161]. The Prewitt, Sobel, Canny, Wallis, Laplacian, and Kirsch operators are commonly used to determine the edges of objects. Each operator has a mask that it uses to detect edges while reducing the amount of data, removing irrelevant features and preserving only the relevant ones. Several works have been presented that are based on these concepts. For example, the authors of [162] proposed a new scheme that uses ant colony optimization (ACO) in image segmentation for the processing of MRI and iris images. The proposed approach can effectively segment images into finer details compared to other segmentation techniques, especially for images with difficult local texture situations. The authors in [163] used deformable registration to introduce a new geometric deformable model that combines edge- and region-based information with prior shape knowledge. This model consists of training and test phases. The training phase uses the genetic algorithm to learn the level set parameters, while the test phase applies segmentation methods to images. The experiments showed better performance than other state-of-the-art methods, especially when segmenting anatomical structures from various biomedical image modalities. The modified watershed segmentation algorithm (MWS) is used in [164] to segment brain tumors in MRI images with the Xilinx Virtex-5 FPGA. The proposed approach gave better results than other algorithms.
In [165], the authors used ensemble deep neural networks with Particle Swarm Optimization (PSO) to segment the optic disc (OD) in retinal images. This technique consists of a modified PSO method, an accelerated super-ellipse action, a random leader-based search method, a refined super-ellipse method, and an average leader-based search method. The PSO algorithm is used with Mask R-CNN to overcome the bias of single networks in OD segmentation. The technique proved superior to existing studies on OD segmentation and to other competitive algorithms in solving diverse unimodal and multimodal benchmark optimization functions and in the detection of diabetic macular edema. Diseases of the retina pose a serious health risk to older people because they can result in blindness and vision loss. Therefore, the authors in [166] proposed a new OD detection method based on Markowitz portfolio optimization to segment the OD in retinal images. The authors validated the technique on four different data sets: Messidor, HRF, DRIVE, and a data set obtained from the Hospital Universitario Sant Joan de Reus (Spain). The results of this method proved superior to those of other competitive techniques. The authors in [167] introduced a novel approach to curve detection that uses the Canny edge detector. The efficacy of this method was evaluated on magnetic resonance imaging (MRI) scans of brain tumors, specifically using images from the Neoplastic Disease section of the Whole Brain Atlas datasets from Harvard University Medical School. The results demonstrate the effectiveness of the proposed technique in accurately identifying tumor regions within the images, surpassing original active contour models such as CV, LBF, and LIF.
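The mask-based operators mentioned in this subsection all follow the same pattern: filter the image with small kernels and threshold the response. The sketch below illustrates this with the Sobel masks. It is a plain, unoptimized illustration with hypothetical function names; the border handling (edge replication) and the threshold value are assumptions, and in NIOA-based work such a threshold is typically one of the quantities being optimized.

```python
import numpy as np

# Sobel masks for horizontal (x) and vertical (y) intensity changes.
SOBEL_X = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T

def filter3x3(img, kernel):
    """Correlate img with a 3x3 kernel, replicating the border pixels."""
    height, width = img.shape
    padded = np.pad(img.astype(float), 1, mode="edge")
    out = np.zeros((height, width), dtype=float)
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * padded[dy:dy + height, dx:dx + width]
    return out

def sobel_edges(img, threshold):
    """Boolean edge map: gradient magnitude above the given threshold."""
    gx = filter3x3(img, SOBEL_X)
    gy = filter3x3(img, SOBEL_Y)
    return np.hypot(gx, gy) > threshold

# Toy image: dark left half, bright right half (step edge between columns 3 and 4).
image = np.zeros((8, 8))
image[:, 4:] = 255.0

edges = sobel_edges(image, threshold=100.0)
```

On this toy image the detector marks only the two columns adjacent to the step, which is the expected response of a 3x3 mask to an ideal vertical edge.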

5.4 Region-based image segmentation using NIOAs

In region-based image segmentation, the image is divided into regions that are created by splitting or merging neighboring pixels [168]. This kind of technique groups pixels that have the same characteristics into one region and the remaining pixels into other regions of the image. A number of publications in the medical field depend on region-based image segmentation techniques. For example, a new framework is proposed in [169] using abdominal CT images with a multilevel local region-based Sparse Shape Composition (SSC) model and a hierarchical deformable shape optimization algorithm for liver segmentation in the portal phase. The results of the approach show slightly superior performance compared to other methods. In [170], the authors used the multi-objective particle swarm optimization (MOPSO) approach to segment Magnetic Resonance Imaging (MRI) of the human brain and to overcome the drawbacks of the region-based active contour method and the fuzzy entropy clustering method. This approach is verified using the Internet Brain Segmentation Repository (IBSR), real MR images from the McConnell Brain Imaging Center, and simulated MR images. It was shown to be superior in segmentation performance in terms of robustness and accuracy. In [171], a particle swarm optimization algorithm is combined with a robust graph-based (RGB) segmentation method to extract breast tumors from ultrasound images. The results showed that the proposed technique can segment ultrasound images of lesions more accurately than two conventional region-based methods and the RGB segmentation method.
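The idea of grouping neighboring pixels with similar characteristics can be illustrated with seeded region growing, one of the simplest region-based methods. The sketch is not taken from any of the cited works: the 4-connectivity, the homogeneity test against the seed intensity, the tolerance value, and the function name are all assumptions.

```python
import numpy as np
from collections import deque

def region_grow(img, seed, tol):
    """Grow a 4-connected region from seed, adding neighbours whose
    intensity is within tol of the seed intensity."""
    height, width = img.shape
    mask = np.zeros((height, width), dtype=bool)
    seed_value = float(img[seed])
    mask[seed] = True
    queue = deque([seed])
    while queue:
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            inside = 0 <= ny < height and 0 <= nx < width
            if inside and not mask[ny, nx]:
                if abs(float(img[ny, nx]) - seed_value) <= tol:
                    mask[ny, nx] = True
                    queue.append((ny, nx))
    return mask

# Toy image: a 3x3 bright square plus one isolated bright pixel.
image = np.zeros((6, 6))
image[1:4, 1:4] = 200.0
image[5, 5] = 200.0

region = region_grow(image, seed=(2, 2), tol=10.0)
```

The isolated bright pixel is excluded even though its intensity matches, because region-based methods require spatial connectivity in addition to similarity; the seed position and tolerance are natural parameters for an optimizer to tune.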

6 Challenges and future trends

Over recent years, significant effort has been dedicated to addressing challenges in image processing and segmentation. These challenges include tasks such as noise removal and image contrast enhancement. Image segmentation entails dividing an image into multiple homogeneous segments, each containing similar pixels. These similarities can be based on various factors such as color, brightness, intensity, or texture. The diverse array of segmentation features has led to the adoption of various segmentation techniques. Consequently, there is no single method for evaluating segmentation performance due to this diversity. Although some techniques rely only on intensity values for segmentation, they face difficulties in cases of noisy images or low contrast, making the segmentation process challenging. Previous studies have highlighted significant space for improvement in many image segmentation techniques. The primary challenges of image segmentation are delineated as follows.

  1. Selecting the right set of enhancement techniques for color images is a challenging task. Improving the many color models is therefore considered a mandatory task, because any image enhancement model can be run on both color and gray-level images [148].

  2. The main goal of medical image segmentation is to produce segmented images that contain the anatomical structure of specific cells of the human body, but the original medical image may have geometric deformations or blur, which makes this goal challenging [150].

  3. Another challenge is finding the best measurements and criteria over various types of images for a specific disease, such as images of heart disease, images of kidney disease, etc.

  4. Most of the studies reviewed used different data sets received from different research agencies or clinics. It is therefore hard to compare the performance of models and algorithms across studies.

  5. The size of the segmented region may be larger or smaller than that of the original medical image, and this is another challenge.

In the last decade, many researchers have proposed various methods using NIOAs to solve image segmentation problems in the medical field. Therefore, the following points explain the future directions of image segmentation in the medical field using NIOAs.

  • The first important point is that the hybridization of the Tsallis entropy, Kapur, and Otsu objective functions should be improved, and the efficiency of the results should be tested on medical images of different diseases [148].

  • Many NIOAs have not yet been used to optimize the segmentation process, so researchers should apply and improve other NIOAs in the medical field and evaluate their performance over various diseases [3].

  • Another critical point is that most researchers use only a few types of images (ultrasound, mammograms, histological, and MRI). However, other types of images, such as thermal images, can boost the performance of cancer detection in humans.

  • Many improvement techniques, such as global search, local search, and parameter adaptation strategies, should be applied to several NIOAs to produce efficient and accurate results in image segmentation.

7 Conclusion

Image segmentation is a fundamental phase in image processing and analysis and is usually applied in the image pre-processing phase. In this paper, we review the latest studies in medical image segmentation that detect various diseases using nature-inspired optimization algorithms (NIOAs). This paper classifies image segmentation methods into four types, presented in Sect. 5. A strength of the article is its coverage of five popular medical image types: ultrasound, mammograms, MRI, thermography, and histological images. Furthermore, we can conclude from previous studies of medical image segmentation using NIOAs that the most widely used objective functions are Kapur, Otsu, and minimum cross-entropy. The most used performance measures are PSNR, SSIM, FSIM, CPU time, and the value of the objective function, as explained in this paper. This paper also describes the types of NIOAs published in the latest studies.