1 Introduction

Image processing plays a critical role in digital applications such as damage detection and visual recognition. Image segmentation is the first step of image analysis; it partitions an image into several regions with similar properties. Several segmentation approaches have been suggested in the literature, such as edge-based [15, 19, 26, 44, 47], region-based [12, 13, 22, 41, 46], deep-learning [21, 38], and histogram-thresholding [8, 35] methods.

Edge-based segmentation methods distinguish between areas according to boundary localization. Edges are detected from rapid changes in gray level between adjacent regions. Edge detection techniques are useful for high-contrast images where regions are clearly separated. In other cases, they generate ill-defined and discontinuous edges that do not form closed curves. Moreover, edge detection is not immune to noise and produces worse results than other segmentation techniques. Several studies have addressed the edge detection problem in the literature [11, 17, 24, 27, 42]. Tchinda et al. proposed a novel segmentation of blood vessels in retina images based on edge detection techniques [31]. Baltierra et al. used the Ant Colony algorithm to identify edges in noisy images [5].

Region-based segmentation methods group pixels with similar properties, such as intensity and neighborhood. Region-based techniques fall into two categories: region growing and region splitting. In region growing, a pixel is chosen as a seed point, and neighboring pixels with similar intensity are joined to it. In region splitting, a region is divided into two groups if its pixels do not share sufficient segmentation properties. Region-based methods are noise-robust and perform well on homogeneous images. However, they are expensive in both memory usage and execution time.

Moreover, the segmented image is not unique because the output depends on the initial points, known as seed points. In recent years, several applicable region-based segmentation methods have been reported for document images [37], medical images [10, 25, 40, 43], and farm images [20]. Guo et al. suggested a region-growing framework based on a multi-information fusion network with good results on chest CT scans [14]. Chauhan and Goyal considered color burn images and introduced a region segmentation algorithm with 93.4% accuracy [7].

Artificial Neural Networks (ANNs) simulate human-brain interactions and underlie what is known as deep learning. Recently, various types of deep-learning segmentation methods have been developed in the literature [34]. An ANN learns by adjusting its weights and can segment images after training. The procedure is simple, and the network operates in parallel. However, the training procedure is lengthy, and the initialization affects the output image.

Moreover, overfitting may occur on the training set. Different methods have been developed to segment medical images with ANNs. Huang et al. introduced a novel deep-learning method for medical data segmentation that reduces high-dimensional complexity [18]. In another study, different deep-learning network models were compared on colorectal cancer images [16].

Thresholding is a simple and fast segmentation method that needs no prior knowledge about the image. Thresholding-based segmentation methods extract objects from the background by assigning pixel intensities to various levels. Like other segmentation methods, thresholding techniques have disadvantages too. Thresholding does not yield acceptable results on images with flat histograms. Also, it does not consider spatial details, so the output regions may not be contiguous. Recently, various thresholding methods have been reported that use meta-heuristic algorithms to find the best solutions, such as cuckoo search [6], whale optimization [1, 3], and Particle Swarm Optimization (PSO) [45].

This paper presents a novel segmentation method based on Expectation Maximization (EM) and the Gaussian Mixture Model (GMM). The significant contributions of this study are summarized as follows:

  • A novel approach is proposed to correct image segmentation based on EM. Expectation Maximization estimates a histogram with a GMM. In this procedure, some Gaussian distributions may be covered by others and have no chance of being selected as a threshold level. In this case, the number of classes is reduced, and the desired threshold levels cannot be obtained. To overcome this drawback, a mechanism is introduced in this study to maintain the desired number of levels.

  • The EM algorithm is sensitive to its initial points. Poor starting points trap the algorithm in local optima, and premature convergence occurs. To mitigate this shortcoming, a nature-inspired algorithm, namely the Salp Swarm Algorithm (SSA), is used to help EM jump out of local areas.

This study is organized as follows: in Section 2, the theory of multilevel thresholding and Expectation Maximization based on the Gaussian Mixture Model is reviewed, the Salp Swarm Algorithm is described briefly, and the necessary formulas are given. Section 3 presents the proposed algorithm, which combines the EM algorithm with SSA to achieve better performance. Section 4 compares the proposed algorithm with the traditional EM algorithm and four state-of-the-art algorithms in image segmentation; the experimental results with evaluation criteria are presented in this section. Finally, conclusions are drawn in Section 5, emphasizing the efficiency of the proposed algorithm for medical image segmentation.

2 Basic theory

In this section, the definition of the thresholding problem is clarified. Then, the necessary information on Expectation Maximization (EM) is discussed, and the concept of the Salp Swarm Algorithm is briefly explained.

2.1 Multilevel thresholding

Image thresholding is often considered one of the most challenging and intriguing segmentation techniques. Consider an image I of size m × n with L distinct gray levels. The major purpose of thresholding is to determine a threshold vector T = [t1, …, tK − 1] for the image I that separates its pixels into K groups. Each group is defined by

$$\begin{array}{l}G_1=\left\{(x,y)\in I \mid 0\le f(x,y)\le t_1-1\right\}\\ G_k=\left\{(x,y)\in I \mid t_{k-1}\le f(x,y)\le t_k-1\right\},\quad 1<k<K\\ G_K=\left\{(x,y)\in I \mid t_{K-1}\le f(x,y)\le L-1\right\}\end{array}$$
(1)

where Gk is the kth group and (x, y) indicates the location of a pixel in the image I with gray-level intensity f(x, y) between 0 and L − 1. The threshold vector T can be determined by a thresholding criterion such as Otsu's between-class variance [30]. However, this is not a proper segmentation metric: in 2011, Xu et al. proved that the Otsu threshold is biased toward the class with the larger variance [39]. Another way to segment an image is the EM algorithm based on the GMM, which is described as follows.
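To make Eq. (1) concrete, the following minimal sketch assigns each pixel to its group for a given threshold vector; the image, the threshold values, and the function name are illustrative assumptions, not values from this paper.

```python
import numpy as np

def apply_thresholds(image: np.ndarray, thresholds: list) -> np.ndarray:
    """Assign each pixel to one of K groups defined by the threshold
    vector T = [t1, ..., t_{K-1}] of Eq. (1): group k collects pixels
    with t_{k-1} <= f(x, y) <= t_k - 1 (with t_0 = 0 and t_K = L)."""
    # np.digitize maps each intensity to a group index in 0..K-1.
    return np.digitize(image, bins=sorted(thresholds))

# Hypothetical usage: split an 8-bit image into K = 3 groups
# with an illustrative threshold vector T = [85, 170].
img = np.random.randint(0, 256, size=(64, 64), dtype=np.uint8)
labels = apply_thresholds(img, [85, 170])
```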

2.2 Expectation maximization

The Expectation-Maximization algorithm is an approach for performing maximum likelihood estimation in the presence of latent variables. It does this by first estimating values for the latent variables, then optimizing the model, and then repeating these two steps until convergence. It is an effective and general approach, most commonly used for density estimation with missing data, for example in clustering models such as the Gaussian Mixture Model. Assume there are N samples drawn independently from a D-dimensional space and denoted by X [4]

$$X=\left\{x_1,\dots,x_N\right\},\qquad X\in \mathbb{R}^{N\times D},\qquad x_n\in \mathbb{R}^{1\times D}$$
(2)

The glossary of terms and parameters with their definitions is presented in Table 1.

Table 1 Glossary of terms and parameters for the proposed method

To decompose the histogram of an image, the Gaussian Mixture Model is used, which is formulated as

$$p(x_n)=\sum_{k=1}^{K}\pi_k N\left(x_n \mid \mu_k,\Sigma_k\right),\qquad \sum_{k=1}^{K}\pi_k=1,\quad \pi_k>0$$
(3)

In this formula, N is the Gaussian density, and μk, Σk, and πk denote the mean, covariance, and mixing coefficient of the kth component. The purpose is to estimate the μk, Σk, and πk that maximize the likelihood function:

$$L\left(X|\pi, \boldsymbol{\mu}, \boldsymbol{\Sigma} \right)=P\left(X|\pi, \boldsymbol{\mu}, \boldsymbol{\Sigma} \right)={\prod}_{n=1}^Np\left({x}_n\right)={\prod}_{n=1}^N{\sum}_{k=1}^K{\pi}_kN\left({x}_n|{\mu}_k,{\Sigma}_k\right)$$
(4)

where

$$\pi \in \mathbb{R}^{1\times K},\qquad \boldsymbol{\mu}\in \mathbb{R}^{K\times D},\qquad \Sigma_k\in \mathbb{R}^{D\times D},\quad k=1,\dots,K$$

To reduce the complexity of the formulas, the log-likelihood of Eq. (4) is used:

$$LL\left(X|\pi, \boldsymbol{\mu}, \boldsymbol{\Sigma} \right)=\sum \limits_{n=1}^N\ln \left(\sum \limits_{k=1}^K{\pi}_kN\left({x}_n|{\mu}_k,{\Sigma}_k\right)\right)$$
(5)

However, this problem cannot be solved in closed form because the labels of the observations are not available. To reduce the complexity of the problem, we define a label variable

$$Z=\left\{z_1,\dots,z_N\right\},\qquad Z\in \{0,1\}^{N\times K},\qquad z_n\in \{0,1\}^{1\times K}$$
(6-a)

All elements of zn are 0 except one; zn indicates the class of xn.

$${z}_{nk}\in \left\{0,1\right\}\kern0.5em ,\kern1em \sum \limits_{k=1}^K{z}_{nk}=1\kern0.5em$$
(6-b)
$$p\left({z}_{nk}=1\right)={\pi}_k$$
(7)

The probability of zn is calculated as

$$p\left({z}_n\right)=p\left({z}_{n1},\dots, {z}_{nk}\right)=\prod \limits_{k=1}^K{\pi_k}^{z_{nk}}\kern0.75em$$
(8)
$$p\left({x}_n|{z}_{nk}=1\right)=N\left({x}_n|{\mu}_k,{\Sigma}_k\right)\kern0.5em$$
(9)
$$p\left({x}_n|{z}_n\right)=\prod \limits_{k=1}^KN{\left({x}_n|{\mu}_k,{\Sigma}_k\right)}^{z_{nk}}$$
(10)
$$p(x_n)=\sum_{z_n}p\left(x_n \mid z_n\right)p(z_n)=\sum_{z_n}\prod_{k=1}^{K}\left[\pi_k N\left(x_n \mid \mu_k,\Sigma_k\right)\right]^{z_{nk}}=\sum_{k=1}^{K}\pi_k N\left(x_n \mid \mu_k,\Sigma_k\right)$$
(11)

The likelihood and log-likelihood functions are then formulated as

$$L\left(X|\pi, \boldsymbol{\mu}, \boldsymbol{\Sigma} \right)=P\left(X|\pi, \boldsymbol{\mu}, \boldsymbol{\Sigma} \right)=\prod \limits_{n=1}^Np\left({x}_n\right)=\prod \limits_{n=1}^N\sum \limits_{k=1}^K{\pi}_kN\left({x}_n|{\mu}_k,{\Sigma}_k\right)$$
(12)
$$LL\left(X|\pi, \boldsymbol{\mu}, \boldsymbol{\Sigma} \right)=\sum \limits_{n=1}^N\ln \left(\sum \limits_{k=1}^K{\pi}_kN\left({x}_n|{\mu}_k,{\Sigma}_k\right)\right)\kern0.75em$$
(13)

To solve the problem, γ is defined as the probability of assigning a sample to a specific cluster:

$$\gamma(z_{nk})=p\left(z_{nk}=1 \mid x_n\right)=\frac{p\left(z_{nk}=1\right)p\left(x_n \mid z_{nk}=1\right)}{\sum_{j=1}^{K}p\left(z_{nj}=1\right)p\left(x_n \mid z_{nj}=1\right)}=\frac{\pi_k N\left(x_n \mid \mu_k,\Sigma_k\right)}{\sum_{j=1}^{K}\pi_j N\left(x_n \mid \mu_j,\Sigma_j\right)}$$
(14)

Differentiating with respect to μk and Σk and setting the derivatives to zero yields the optimum values:

$$\frac{\partial }{\partial {\mu}_k} LL\left(X|\pi, \boldsymbol{\mu}, \boldsymbol{\Sigma} \right)=0\kern0.5em$$
(15-a)
$$\frac{\partial }{\partial {\Sigma}_k} LL\left(X|\pi, \boldsymbol{\mu}, \boldsymbol{\Sigma} \right)=0\kern0.5em$$
(15-b)

πk can be identified using a Lagrange multiplier:

$$\frac{\partial }{\partial {\pi}_k}\left\{ LL\left(X \mid \pi,\boldsymbol{\mu},\boldsymbol{\Sigma}\right)+\lambda \left(\sum_{k=1}^{K}\pi_k-1\right)\right\}=0$$
(16)

However, no closed-form answer can be found for this problem. The EM algorithm is used to estimate the unknown parameters; its pseudo-code is shown in Fig. 1.

Fig. 1

EM pseudo code for GMM

1- initialize parameters μk, Σk and πk.

2- E-step: compute probability of assigning a data point to a cluster

$$\gamma \left({z}_{nk}\right)=\frac{\pi_{\mathrm{k}}N\left({x}_n|{\mu}_k,{\Sigma}_k\right)}{\sum \limits_{j=1}^K{\pi}_jN\left({x}_n|{\mu}_j,{\Sigma}_j\right)}$$

3- M-step: update GMM parameters based on calculated γ(znk)

$${\mu}_k^{new}=\frac{1}{N_k}\sum \limits_{n=1}^N\gamma \left({z}_{nk}\right){x}_n$$
$${\Sigma}_k^{new}=\frac{1}{N_k}\sum \limits_{n=1}^N\gamma \left({z}_{nk}\right)\left({x}_n-{\mu}_k^{new}\right){\left({x}_n-{\mu}_k^{new}\right)}^T$$
$${\pi}_k^{new}=\frac{N_k}{N},\qquad \text{where}\quad {N}_k=\sum \limits_{n=1}^N\gamma \left({z}_{nk}\right)$$

4- check for convergence: if the log-likelihood converges, the procedure terminates

$$LL\left(X|\pi, \boldsymbol{\mu}, \boldsymbol{\Sigma} \right)=\sum \limits_{n=1}^N\ln \left(\sum \limits_{k=1}^K{\pi}_kN\left({x}_n|{\mu}_k,{\Sigma}_k\right)\right)$$

The process can be described as follows: First, we select initial values for the means, covariances, and mixing coefficients (μk, Σk, and πk). Then, we alternate between two updates called the E (expectation) step and the M (maximization) step. In the expectation step, the current model parameters are used to compute the posterior probabilities (responsibilities) γ(znk). In the maximization step, the responsibilities are used to re-estimate the model parameters (means, covariances, and mixing coefficients). Finally, the log-likelihood is computed and checked for convergence.
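As an illustration of these steps, the following is a compact sketch of EM for a one-dimensional GMM (D = 1), such as pixel intensities, matching the pseudo-code in Fig. 1; the evenly spaced initialization is our own choice, since the E and M updates themselves do not fix it.

```python
import numpy as np
from scipy.stats import norm

def em_gmm_1d(x, K, max_iter=100, tol=1e-6):
    """One-dimensional EM for a GMM, following the pseudo-code in Fig. 1."""
    x = np.asarray(x, dtype=float)
    # 1- Initialize mu_k, sigma_k, pi_k (evenly spread means: our choice).
    mu = np.linspace(x.min(), x.max(), K)
    sigma = np.full(K, x.std() + 1e-6)
    pi = np.full(K, 1.0 / K)
    prev_ll = -np.inf
    for _ in range(max_iter):
        # 2- E-step: responsibilities gamma(z_nk), as in Eq. (14).
        weighted = pi * norm.pdf(x[:, None], mu, sigma)      # shape (N, K)
        gamma = weighted / weighted.sum(axis=1, keepdims=True)
        # 3- M-step: update the GMM parameters from gamma.
        Nk = gamma.sum(axis=0)
        mu = (gamma * x[:, None]).sum(axis=0) / Nk
        sigma = np.sqrt((gamma * (x[:, None] - mu) ** 2).sum(axis=0) / Nk)
        pi = Nk / len(x)
        # 4- Convergence check on the log-likelihood, Eq. (13).
        ll = np.log(weighted.sum(axis=1)).sum()
        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll
    return mu, sigma, pi
```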

2.3 Salp swarm algorithm

The Salp Swarm Algorithm (SSA) was suggested by Mirjalili et al. for solving complicated problems and is based on the behavior of salps living at sea [28]. Assume a population G is randomly generated, and the food source is denoted as F. The group tries to find better food sources in the sea. The location of the leader is updated according to the following formula:

$${\boldsymbol{Gf}}_j^1=\begin{cases}F_j+q_1\left[\left(ub_j-lb_j\right)q_2+lb_j\right], & q_3\ge 0\\ F_j-q_1\left[\left(ub_j-lb_j\right)q_2+lb_j\right], & q_3<0\end{cases}$$
(17)

where \({\boldsymbol{Gf}}_j^1\) is the leader of the group (the first individual), j refers to the j-th dimension of the space, and Fj denotes the food location. q2 and q3 are drawn from a uniform distribution, and q1 adjusts the exploration ability through the following equation:

$${q}_1=2{e}^{-{\left(4\frac{iter}{\max \_ iter}\right)}^2}$$
(18)

where iter is the current iteration and max _ iter denotes the maximum number of iterations. The locations of the follower members are updated by

$${\boldsymbol{Gf}}_j^i=\frac{1}{2}\left({\boldsymbol{Gf}}_j^i+{\boldsymbol{Gf}}_j^{i-1}\right)$$
(19)

where \({\boldsymbol{Gf}}_j^i\) indicates the position of the i-th agent (i ≥ 2) in the j-th dimension at the current iteration. SSA has two advantages over other metaheuristic algorithms: fast convergence and escape from local areas. SSA maintains a smooth balance between exploration and exploitation to track the global optimum while converging quickly. Also, following the leader combined with random walking helps the algorithm avoid being trapped at local points [2].
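The following is a minimal SSA sketch under Eqs. (17)–(19) for a generic minimization problem. Since q3 is drawn here from U(0, 1), the sign switch in Eq. (17) is applied at 0.5; this threshold, the bound clipping, and the function names are our assumptions rather than details fixed by the text.

```python
import numpy as np

def ssa_minimize(fitness, lb, ub, n_agents=30, max_iter=100):
    """Minimal Salp Swarm Algorithm following Eqs. (17)-(19).
    fitness maps a position vector to a scalar to be minimized;
    lb and ub are per-dimension bounds (1-D arrays of equal length)."""
    lb, ub = np.asarray(lb, float), np.asarray(ub, float)
    dim = lb.size
    G = lb + np.random.rand(n_agents, dim) * (ub - lb)   # random population
    F = min(G, key=fitness).copy()                       # food source = best salp
    for it in range(1, max_iter + 1):
        q1 = 2 * np.exp(-((4 * it / max_iter) ** 2))     # Eq. (18)
        # Leader update, Eq. (17); q3 ~ U(0,1), so the sign flips at 0.5.
        q2, q3 = np.random.rand(dim), np.random.rand(dim)
        step = q1 * ((ub - lb) * q2 + lb)
        G[0] = np.where(q3 >= 0.5, F + step, F - step)
        # Follower update, Eq. (19): move halfway toward the salp in front.
        for i in range(1, n_agents):
            G[i] = 0.5 * (G[i] + G[i - 1])
        G = np.clip(G, lb, ub)
        best = min(G, key=fitness)                       # track the food source
        if fitness(best) < fitness(F):
            F = best.copy()
    return F
```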

3 The proposed algorithm

The EM algorithm based on the GMM is a powerful and fast method for image segmentation. The algorithm is formulated according to Bayes' theorem and estimates a histogram with a mixture of Gaussian functions. However, it cannot guarantee the pre-defined number of classes. For example, Fig. 2a traces the histogram of a grayscale image together with the Gaussian mixture functions estimated by the EM algorithm. EM decomposes the histogram into three Gaussian functions properly. However, Gaussian function two is covered by Gaussian functions one and three. This means the second class has no chance of winning the segmentation competition, so we obtain only two classes (Gaussian functions one and three) instead of three. To overcome this shortcoming, a mechanism is introduced in this paper. Assume that the desired number of segmentation regions is K and we want to segment the image into these K classes. If, in the process, M classes are covered, the image is segmented into K − M groups. To compensate for this reduction, M new classes are created from the M classes that have been covered by other Gaussian functions. Fig. 2b shows a covered Gaussian function. In this situation, a new class is created over the intensity range μi ± dist, where μi indicates the mean value of the i-th covered Gaussian function and dist is the distance from μi that limits the intensity range of the new class. Every pixel with intensity between μi − dist and μi + dist is assigned to this class. This technique recovers missed classes and improves the quality of the segmented image (a code sketch of this recovery step is given after Fig. 2).

Fig. 2

Missed class in GMM and its recovery
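A minimal sketch of this recovery mechanism follows. It assumes the per-intensity competition is decided by the weighted Gaussian densities and that 8-bit intensities (256 levels) are used; the function name and relabeling strategy are illustrative, not the paper's exact implementation.

```python
import numpy as np
from scipy.stats import norm

def recover_missed_classes(labels, image, mu, sigma, pi, dist=4):
    """Recover Gaussian components that never win the intensity
    competition by relabeling pixels within mu_i +/- dist.
    labels: current per-pixel class map; mu, sigma, pi: fitted GMM."""
    levels = np.arange(256)
    # Winner of each intensity level among the K weighted Gaussians.
    scores = pi * norm.pdf(levels[:, None], mu, sigma)   # shape (256, K)
    winners = set(scores.argmax(axis=1))
    missed = [k for k in range(len(mu)) if k not in winners]
    out = labels.copy()
    for k in missed:
        # Assign every pixel with intensity in [mu_k - dist, mu_k + dist].
        out[np.abs(image.astype(float) - mu[k]) <= dist] = k
    return out
```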

The second drawback of the EM algorithm is the convergence problem. EM is a greedy algorithm and converges to the nearest optimum point; no mechanism in the algorithm allows it to jump out of local areas. For example, Fig. 3a shows the log-likelihood as a function of a tuning parameter, which can be a mean, a variance, or a Gaussian coefficient. The EM algorithm climbs the curve to reach the top of a local peak (P1, P2, or P3), and no mechanism lets it escape from these local areas; therefore, the log-likelihood always shows a saturation curve. We address this shortcoming with a mechanism in this paper: the EM algorithm is equipped with a nature-inspired algorithm, namely the Salp Swarm Algorithm (SSA). SSA helps the EM algorithm jump out of local areas and find a better solution. For example, every five iterations, SSA searches the GMM parameter space (means, variances, and coefficients) and introduces a suggested point to EM. The EM algorithm evaluates the suggested solution; if it has a better log-likelihood, EM jumps to the suggested solution and continues the convergence from that point. SSA finds an optimum solution according to an RMSE fitness function, defined as follows:

$$fitness=\sqrt{\frac{\sum_{l=0}^{L-1}\left(hist(l)-\sum_{j=1}^{K}\pi_j N\left(l \mid \mu_j,\Sigma_j\right)\right)^2}{L}}$$
(20)
Fig. 3

Log-Likelihood convergence diagram a) EM convergence behavior according to different initial points b) SSA suggests a better solution (green point) to escape from local traps

where L indicates the number of intensity levels and hist(l) is the histogram value at intensity level l. Fig. 3b shows a convergence process in which SSA finds a better solution and introduces it to the EM algorithm (green point). In this example, EM takes the green point as its new initial point and continues the calculations from there.
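A small sketch of the fitness in Eq. (20) is given below; packing the GMM parameters into a single vector, as an SSA search agent would carry them, is our assumption about the encoding.

```python
import numpy as np
from scipy.stats import norm

def rmse_fitness(params, hist):
    """RMSE between the image histogram and the GMM curve, Eq. (20).
    params is assumed to pack (pi_1..pi_K, mu_1..mu_K, sigma_1..sigma_K)."""
    params = np.asarray(params, dtype=float)
    K = len(params) // 3
    pi, mu, sigma = params[:K], params[K:2 * K], params[2 * K:]
    levels = np.arange(len(hist))
    # GMM curve evaluated at every intensity level.
    model = (pi * norm.pdf(levels[:, None], mu, sigma)).sum(axis=1)
    return np.sqrt(np.mean((hist - model) ** 2))
```

This function can serve directly as the fitness callable for the SSA sketch in Section 2.3.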

The pseudo-code and flowchart of the proposed method are shown in Fig. 4 and Fig. 5, respectively (a driver-loop sketch is also given after Fig. 5). The algorithm starts with GMM parameter initialization (μk, Σk, and πk). In every iteration, new means, variances, and coefficients are calculated (\({\mu}_k^{new}\), \({\Sigma}_k^{new}\), and \({\pi}_k^{new}\)). If EM reaches the saturation area, the algorithm terminates. Every five iterations, SSA is invoked and finds a local solution in the search space: μk, Σk, and πk are initialized, and SSA updates the locations of the search agents by Eqs. (17)–(19). When the termination criterion is satisfied, SSA finishes, and the best solution, denoted \({\boldsymbol{Gf}}_j^1\), is fed into the EM algorithm. If this point has a higher log-likelihood than the current point, EM takes the suggested solution as its initial point and continues the convergence process from there. Otherwise, the iteration counter increases, and the EM algorithm calculates new GMM parameters.

Fig. 4

Pseudo-code of the proposed method

Fig. 5

Flowchart of the proposed method
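To tie the pieces together, here is a minimal driver sketch of the loop in Figs. 4 and 5. The EM update, log-likelihood, and SSA search are injected as callables (they can be built from the sketches above); the five-iteration period follows the text, while the tolerance value is our assumption.

```python
def em_with_ssa(em_step, log_likelihood, ssa_search, theta0,
                max_iter=100, ssa_period=5, tol=1e-6):
    """Driver loop of Figs. 4-5: run EM updates and, every ssa_period
    iterations, accept an SSA proposal only if it raises the
    log-likelihood. The three callables correspond to one EM update,
    the log-likelihood of Eq. (13), and the SSA search of Section 3."""
    theta = theta0
    prev_ll = float("-inf")
    for it in range(1, max_iter + 1):
        theta = em_step(theta)                  # one E-step + M-step
        ll = log_likelihood(theta)
        if it % ssa_period == 0:
            candidate = ssa_search()            # SSA suggestion point
            if log_likelihood(candidate) > ll:  # better solution found:
                theta = candidate               # jump out of the local area
                ll = log_likelihood(candidate)
        if abs(ll - prev_ll) < tol:             # saturation: terminate
            break
        prev_ll = ll
    return theta
```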

4 Results and discussions

This section reports practical experiments on sets of medical grayscale images. PSNR and FSIM are employed as two evaluation metrics to show the superiority of the proposed method. PSNR measures the robustness of segmentation algorithms and is defined as

$$PSNR=10\log_{10}\left(\frac{255^2}{MSE}\right)$$
(21-a)

where the mean squared error (MSE) is expressed as

$$MSE=\frac{1}{MN}\sum \limits_{i=1}^M\sum \limits_{j=1}^N{\left(A\left(i,j\right)-B\left(i,j\right)\right)}^2$$
(21-b)

where A and B denote the original and segmented images, respectively, and M and N are the image dimensions. FSIM is the Feature Similarity Index, formulated as [23]

$$FSIM=\frac{\sum_{x\in \Omega}{S}_L(x)P{C}_m(x)}{\sum_{x\in \Omega}P{C}_m(x)}$$
(22)

where Ω is the entire image domain, SL(x) is the similarity between the segmented image and the original image, and PCm(x) is the phase congruency. Higher PSNR or FSIM values mean better segmentation quality. As with other optimization problems, setting the parameters is the first step in solving the problem; Table 2 lists the parameter settings used to run the EM algorithm and the proposed method. Each algorithm is implemented in MATLAB on a machine with a 2.7 GHz CPU and 8 GB of RAM.
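As a reference sketch, the following computes PSNR per Eqs. (21-a) and (21-b) for 8-bit images; FSIM is omitted here because its phase-congruency computation is too long for a short sketch.

```python
import numpy as np

def psnr(original: np.ndarray, segmented: np.ndarray) -> float:
    """PSNR between the original image A and the segmented image B,
    per Eqs. (21-a) and (21-b), assuming 8-bit gray levels."""
    a = original.astype(np.float64)
    b = segmented.astype(np.float64)
    mse = np.mean((a - b) ** 2)                # Eq. (21-b)
    if mse == 0:                               # identical images
        return float("inf")
    return 10 * np.log10(255.0 ** 2 / mse)     # Eq. (21-a)
```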

Table 2 Setting parameters of EM algorithm and the proposed method under test

A set of 16 medical lung images, shown in Fig. 6, is utilized to measure the performance of the introduced algorithm. These images are selected from the COVID-19 Radiography Database [36], a database of chest X-ray images created by a team of researchers from Qatar University that covers COVID-19-positive cases along with normal and viral pneumonia images.

Fig. 6

Lung images used in the experiments

Fig. 7 shows the histograms of the Fig. 6 images as estimated by the EM algorithm and the introduced method. In this simulation, the aim is to decompose each histogram into ten classes; in other words, the algorithms look for GMM parameters that fit the histogram as closely as possible. A real image has a discontinuous histogram that smooth Gaussian mixtures cannot match exactly, so some estimation error is inevitable.

Fig. 7

Estimation of the test-image histograms by the EM algorithm and the introduced method

Table 3 lists the number of missed (covered) classes in the EM algorithm, where k indicates the number of desired classes. For example, for image 1, we want to segment the histogram into 15 groups (k = 15); however, 4 Gaussian functions are covered by the others, and we get only 11 classes. The proposed algorithm segments the image into all 15 classes because the covered functions are detected and a range of intensities is assigned to each of them. In this paper, dist = 4 is used for the intensity range (see the covered Gaussian function in Fig. 2). Generally, the probability of missed classes increases with the number of segmentation classes.

Table 3 Number of missed classes in EM algorithm

Table 4 lists the PSNR values of the test images segmented by the EM algorithm and the proposed algorithm; better results are presented in boldface. From the PSNR values in Table 4, the proposed method performs better in 41 out of 48 cases, while the EM algorithm performs better in only 7 cases. If an image is segmented into 255 levels, the output image will be identical to the original image, and the PSNR will be infinite. Generally, PSNR values increase with the number of segmentation parts. Consequently, when some Gaussian functions are covered in the EM algorithm and the number of segmentation parts is reduced, the PSNR metric is degraded as well.

Table 4 PSNR values of test images obtained by EM algorithm and the proposed method

Moreover, FSIM values are calculated for the segmented test images; the results are shown in Table 5, with the higher values in boldface. The table shows that the proposed method scores better results in most cases. Generally, a higher number of segments means less difference between the original and output images and a higher FSIM value. According to Table 3, the EM algorithm yields missed classes in most cases, which is the main reason it cannot achieve better FSIM values.

Table 5 FSIM values of test images obtained by EM algorithm and the proposed method

To visually compare the performance of the introduced method and the EM algorithm, the test images segmented with k = 20 are shown in Fig. 8. As can be seen from the segmented images, the proposed algorithm is superior to the EM algorithm and preserves more details, because a higher number of segmentation levels yields smoother results and less difference between the input and segmented images. For example, in Test 6, the image segmented by the proposed algorithm has more details than that of the EM algorithm. In some cases, such as Test 2, the EM algorithm produces rough results because of missed classes.

Fig. 8

Results of segmenting different test images by EM algorithm and the proposed approach

Fig. 9 presents the convergence curves of the test images to improve understanding of the studied algorithms' segmentation performance. SSA equips the proposed algorithm to jump out of local optima, while the EM algorithm lacks this mechanism. As described in Section 3, every five iterations, SSA finds a candidate solution in the search space and introduces it to the EM algorithm. If the candidate solution has a higher log-likelihood, the EM algorithm takes it as its new initial solution, jumps to this point, and continues the convergence process. In some cases, such as Test 1, Test 2, Test 10, and Test 13, the suggested solution has a better log-likelihood value; hence, the EM algorithm abandons the previous solution and restarts the process from these points. These test images have discontinuous convergence curves, marked with red circles in Fig. 9. As the number of iterations increases, the probability of finding a better solution decreases, because the EM algorithm climbs toward the top of a local optimum and the chance of finding a solution above it shrinks. Hence, in most cases, such as Test 1, Test 10, and Test 13, the break-point occurs at the beginning of the search process.

Fig. 9

Convergence curve comparisons of 16 test images

To assess the speed of the proposed algorithm, the execution times of the studied methods are calculated and presented in Fig. 10 for the different test images. As seen in this figure, the computational time of the proposed algorithm is higher than that of the EM algorithm, which is the disadvantage of the introduced method: although SSA is a fast algorithm, its computational time is added to the total execution time. Two techniques are considered in the proposed method to reduce the running time:

  a) The fitness function is defined as Eq. (20) instead of the log-likelihood, which is a less complicated formula.

  b) SSA starts every five iterations instead of every iteration, saving total time.

Fig. 10

Comparison between EM algorithm and the proposed method in terms of execution time to obtain the best result

As discussed, most jumping points in convergence curves occur at the beginning of the process. The execution of SSA can be limited to the first 50 iterations of a run to save more computational time.

To further evaluate the proposed algorithm, four recent segmentation algorithms, namely ECSO [35], FCS [6], BOA [33], and EMA [32], are implemented on the medical images shown in Fig. 11 to demonstrate the superiority of the proposed method. This dataset contains 1000 CT scans of patients diagnosed with COVID-19 and segmentations of lung infections made by experts [29]. The dataset aims to encourage the research and development of effective and innovative methods to identify COVID-19 infection through the analysis of CT scans.

Fig. 11

Test images used to evaluate the ability of the proposed algorithm

FSIM and PSNR are selected to compare the efficiency of the different methods and the accuracy of the obtained solutions. For identical conditions, the initial members of the search methods were drawn from a uniform distribution over [0, 255]. The number of iterations was set to 100, and the population size to 200, for all algorithms to obtain a fair comparison. The other adjustable parameters are set according to the reference papers, as listed in Table 6. The goal is to segment each image into ten classes. The segmented images are shown in Fig. 12.

Table 6 Tuning parameters used for different algorithms
Fig. 12

Segmented test images after multilevel thresholding (levels = 10)

It is evident from these figures that the FCS algorithm gives blurred results, and ECSO performs better than FCS; however, these two algorithms are less competitive than the others. The BOA algorithm yields over-segmented results in some cases, such as Image 5. The figures show that the EMA algorithm and the proposed method segment the different images better. However, the main limitation of these methods is their sensitivity to noise.

In addition, the PSNR values of the output images are computed to give a better sense of the segmentation quality, and the results are listed in Table 7, with the highest values marked in boldface. These results show that the proposed algorithm has better PSNR values than the compared algorithms, obtaining the best result in 4 out of 8 cases.

Table 7 Comparison of PSNR values computed by different algorithms

Table 8 shows the FSIM values extracted from the test images. These results indicate that BOA obtains the first rank, performing best in 3 out of 8 cases. The proposed method comes second with 2 cases, and each of the other algorithms has only one.

Table 8 Comparison of FSIM values computed by different algorithms

For further analysis, the CPU time of each algorithm is recorded and shown in Fig. 13. From this figure, the EMA algorithm achieves the best result, BOA obtains the second rank, and the FCS algorithm is ranked third. In contrast, the proposed method is the slowest algorithm in the experiments.

Fig. 13

Comparison of CPU time consumption (in milliseconds) of studied algorithms

Another experiment, the Wilcoxon signed-rank test [9], is conducted for a fair statistical comparison between the algorithms. This test is based on two hypotheses: the null hypothesis states that there is no significant difference between two algorithms, whereas the alternative hypothesis states that there is a difference. p-values are used to reject the null hypothesis; a p-value below 0.05 indicates a significant difference between the two groups of results. Table 9 lists the calculated p-values of the Wilcoxon signed-rank test, with values greater than 0.05 shown in bold. According to the obtained results, the p-values are less than 0.05 in 30 out of 32 cases. In other words, the superiority of the proposed method is confirmed by the Wilcoxon rank test.
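For reference, such a paired test can be run with SciPy as sketched below; the score vectors are placeholders, not the paper's measured values.

```python
from scipy.stats import wilcoxon

# Placeholder paired scores (e.g., PSNR per test image) for the
# proposed method and one competitor; not the paper's measured values.
proposed   = [28.1, 27.4, 29.0, 26.8, 30.2, 27.9, 28.5, 29.3]
competitor = [26.9, 27.0, 27.8, 26.1, 28.7, 27.2, 27.6, 28.4]

stat, p_value = wilcoxon(proposed, competitor)
# p_value < 0.05 rejects the null hypothesis of no significant
# difference between the two paired result groups.
print(f"W = {stat:.1f}, p = {p_value:.4f}")
```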

Table 9 Calculated p-values of the proposed method versus other algorithms

5 Conclusion

This work suggests an improved segmentation method based on the EM algorithm. To overcome the shortcomings of the EM algorithm in image segmentation, two mechanisms are considered. First, covered Gaussian functions are detected, and a class is assigned to each of them to compensate for missed classes. Second, SSA is applied to find new candidate solutions; if a candidate has a better fitness value, the EM algorithm abandons its previous answer and restarts the process from that point, which helps the algorithm jump out of local areas. Twenty-four medical images are selected to show the performance of the proposed algorithm. Comparing the EM algorithm and the proposed method, the obtained results on segmentation metrics such as PSNR and FSIM clearly show the higher accuracy of the proposed method. The convergence curves also demonstrate that the proposed algorithm can escape local optima using the SSA suggestion points, and the Wilcoxon rank test confirms the statistical superiority of the proposed algorithm. The main disadvantage of the introduced method is its execution time, which can be improved in future work by employing some shortcut techniques. Sorted from best to worst running time, the algorithms rank EMA < BOA < FCS < ECSO < proposed method. Based on FSIM values, the studied algorithms rank ECSO = FCS = EMA < proposed method < BOA; that is, BOA performs best on the FSIM metric, and the proposed algorithm achieves the second rank. Sorted from worst to best PSNR, the algorithms rank ECSO = FCS = BOA = EMA < proposed method. Overall, it is concluded that the proposed method performs better in the segmentation process, ranking first in PSNR and second in FSIM.

In future work, to further improve the segmentation quality, the parameter sensitivity of SSA when applied to the EM algorithm will be studied. The suggested methodology may also be helpful for other applications such as brain magnetic resonance image segmentation, breast cancer thermogram image segmentation, and the segmentation of other natural grayscale or RGB images.