1 Introduction

Clinicians can identify diseases earlier thanks to medical imaging, which improves patient outcomes. Appropriate medical image analysis is essential to aid specialists and promote a healthy community. The application of image processing techniques for analyzing medical images is exceedingly successful, thanks to advanced medical equipment. Image segmentation, the first step in image processing, is one of the most significant and complex challenges in image analysis, particularly in the application of medical images. Segmentation divides an image into meaningful, nonoverlapping, homogeneous, connected regions concerning color similarity. Medical image segmentation aims to isolate anatomical objects of interest for analysis and is critical in medical imaging applications [1,2,3,4]. There is a wide variety of image segmentation methods, including threshold-based, clustering-based, region-based, edge-based, etc. [5,6,7,8,9]. Other hybrid image segmentation techniques combine multiple approaches [10]. Many years of research have examined segmentation features and methods. Nevertheless, one of the restrictions is that the appropriate number of segments is a parameter that must be established a priori, and determining this value is not an easy task [11]. In addition, the problem remains tough because, as the desired number of segments increases, the problem’s computing cost increases exponentially, making it unfeasible to employ accurate methods to search for all possible solutions exhaustively.

Computerized tomography (CT), magnetic resonance imaging (MRI), electroencephalography (EEG), and positron emission tomography (PET) are the four most prevalent techniques for imaging the brain, and MRI is the most frequently used. MRI does not expose patients to radiation, is minimally invasive, and is widely available [12]. It also provides greater soft tissue contrast, discerning more clearly between fatty tissue, water, muscle, and other types of soft tissue [13]. The information contained in these images helps medical professionals diagnose a wide range of illnesses and ailments. The goal of brain tumor segmentation is to differentiate diseased tissue from the normal tissues of the brain, specifically cerebrospinal fluid, gray matter, and white matter. When investigating brain tumors, it is currently simple to identify abnormal tissues; nevertheless, the segmentation process is challenging, making it difficult to reproduce results, characterize abnormalities, and achieve precision [13]. The term “tumor” refers to a mass formed by the uncontrolled growth of cells. Tumors come in various subtypes and manifestations, each of which can be treated with a specific modality according to its unique symptoms. Brain segmentation produces a labeled image that shows the boundaries of the different regions, or a set of contours. Segmentation methods can be further divided into bi-level segmentation methods, which divide an image into two parts, and multi-level segmentation methods, which divide an image into more than two parts [14]. Since MRI brain images contain more than two different types of regions, each of which may correspond to a unique object, bi-level segmentation cannot be effective and results in under-segmentation.

As a result, multi-level image segmentation algorithms should be utilized to segment MRI brain images. Also, segmenting color (RGB) images is difficult because of the range of color intensities and the three color channels, unlike gray images, which have only one [15]. Various approaches can be taken to segment MRI brain images [1, 13]. Methods for segmenting MRI brain images can be divided into three broad classes: classification-based, region-based, and boundary-based methods [16, 17]. Fuzzy c-means (FCM) is a typical classification-based approach widely used in medical image segmentation [18,19,20,21]. However, such clustering methods require the number of segments to be specified in advance.

Furthermore, computational time is also a consideration since it depends on the number of clusters and the image size. The motivation of this paper is therefore to segment the image automatically, regardless of image size, by detecting the actual histogram peaks and selecting the appropriate number of clusters from them. To get around these problems, this study suggests using a 3D histogram and a modified multimodal particle swarm optimization method for brain MRI segmentation.

The cluster centers can be found by detecting the peaks in a three-dimensional histogram of a color image, created from the RGB values of the pixels and smoothed with a Gaussian filter. A multimodal variant of particle swarm optimization (PSO) with a local search strategy is used to find the global and local peaks in the histogram, which serve as the cluster centers. The contribution of the proposed clustering-based method is that a non-Euclidean distance metric replaces the Euclidean metric both in calculating pixel similarity and in the movement strategy of the multimodal optimization algorithm. The number of clusters in the image is determined automatically from the number of peaks detected by PSO. Once the peaks are found, each pixel is assigned to the cluster whose center is closest to it under the non-Euclidean distance, which yields the final segmented brain image. The proposed algorithm has been compared with the FCM [22] clustering algorithm.

The rest of this paper is organized as follows. Section 2 presents the FCM clustering. Section 3 describes the multimodal PSO algorithm. The proposed method is described in Section 4. Section 5 presents the experimental and comparative results, and Section 6 concludes the paper.

2 Fuzzy c-means clustering

The k-means and c-means algorithms are two of the most well-known clustering methodologies in color image segmentation. These algorithms frequently yield good results and are widely used. Nevertheless, as mentioned before, one of the constraints is that the number of clusters is a parameter that must be determined in advance, and determining the value of this parameter is not easy. In addition, the computational time is a primary concern when solving the problem, as it depends on the required number of clusters and the image size. The c-means algorithm uses fuzzy memberships to put each pixel in the correct category.

$$J=\sum_{j=1}^N\sum_{i=1}^c{u}_{ij}^m{\left\Vert {x}_j-{v}_i\right\Vert}^2,$$
(1)

where \({u}_{ij}\) is the membership of pixel \({x}_j\) in the ith cluster, \({v}_i\) is the center of the ith cluster, N is the number of pixels, c is the number of clusters, ‖∙‖ is a norm metric, and m is a constant. The fuzziness of the final partition is determined by the value of the parameter m, and usually m = 2.

The cost function is minimized when high membership values are assigned to pixels close to the centroid of their cluster and low membership values are assigned to pixels far from it. The membership function expresses the probability that a pixel belongs to a particular cluster, and in this technique that probability depends only on the distance between the pixel and each cluster center. The membership functions and cluster centers are updated as follows:

$${u}_{ij}=\frac{1}{\sum_{k=1}^c{\left(\frac{\left\Vert {x}_j-{v}_i\right\Vert }{\left\Vert {x}_j-{v}_k\right\Vert}\right)}^{2/\left(m-1\right)}}$$
(2)
$${v}_i=\frac{\sum_{j=1}^N{u}_{ij}^m{x}_j}{\sum_{j=1}^N{u}_{ij}^m}$$
(3)

The algorithm starts from an initial guess for the cluster centers, which marks the mean location of each cluster. Next, c-means assigns a random membership grade to each data point for each cluster. Finally, by repeatedly updating the cluster centers and the membership grades, c-means moves the cluster centers to the correct location within the data set and calculates each data point's degree of membership in each cluster. This iteration minimizes an objective function that measures the distance from each data point to each cluster center, weighted by the point's membership in that cluster.
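The iteration described above can be sketched in a few lines of plain Python. This is a minimal illustration of Eqs. 1 to 3, not the implementation used in the experiments: the function name `fcm`, the stopping tolerance, the random seed, and the toy pixel values are all assumptions introduced here.

```python
import random


def fcm(points, c, m=2.0, max_iter=100, tol=1e-6, seed=0):
    """Fuzzy c-means on a list of equal-length feature tuples.

    Returns (centers, u), where u[i][j] is the membership of point j
    in cluster i.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial memberships, normalised so that each point's
    # memberships over the c clusters sum to 1.
    u = [[rng.random() + 1e-9 for _ in range(n)] for _ in range(c)]
    for j in range(n):
        total = sum(u[i][j] for i in range(c))
        for i in range(c):
            u[i][j] /= total

    centers = []
    for _ in range(max_iter):
        # Eq. 3: each center is the mean of the points weighted by u_ij^m.
        centers = []
        for i in range(c):
            w = [u[i][j] ** m for j in range(n)]
            w_sum = sum(w)
            centers.append(tuple(
                sum(w[j] * points[j][d] for j in range(n)) / w_sum
                for d in range(dim)
            ))

        # Eq. 2: recompute memberships from the distances to the centers.
        largest_change = 0.0
        for j in range(n):
            dist = [
                sum((points[j][d] - centers[i][d]) ** 2 for d in range(dim)) ** 0.5
                for i in range(c)
            ]
            zeros = [i for i in range(c) if dist[i] == 0.0]
            for i in range(c):
                if zeros:
                    # The point coincides with one or more centers.
                    new = 1.0 / len(zeros) if i in zeros else 0.0
                else:
                    new = 1.0 / sum(
                        (dist[i] / dk) ** (2.0 / (m - 1.0)) for dk in dist
                    )
                largest_change = max(largest_change, abs(new - u[i][j]))
                u[i][j] = new
        if largest_change < tol:
            break
    return centers, u


# Toy RGB pixels: three dark and three bright (illustrative values only).
pixels = [(0, 0, 0), (1, 1, 1), (2, 0, 1),
          (100, 100, 100), (101, 99, 100), (99, 101, 100)]
centers, u = fcm(pixels, c=2)
print(sorted(centers))
```

Note that the number of clusters `c` has to be passed in explicitly, which is the limitation the proposed method aims to remove.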

3 Multimodal PSO

While unimodal optimization algorithms can identify only a single global optimum within the search space, multimodal optimization (MMO) algorithms can find many local and global optima [23, 24]. Although multimodal optimization approaches have not been explored nearly as extensively as unimodal ones, they have recently gained much attention. However, most of them share the same problem with niching parameters: existing approaches have trouble determining the right niching radius [25]. In the majority of studies, basic unimodal optimization algorithms, such as the Genetic Algorithm [26,27,28] and the PSO algorithm [29,30,31,32], have been modified to become MMO algorithms. PSO’s movement (crossover) method is well suited to adaptation to a multimodal form. PSO is a nature-inspired method for unimodal stochastic optimization. It has been used to solve many computational problems [33] and has been extended numerous times to a multimodal form in the literature. PSO mimics the swarming behavior of foraging birds to move particles toward the best solution, so it relies on the movement of particles in the search space to determine the best value. Each particle keeps track of its own best position (the personal best) and the best position found by the population so far (the global best). The ith particle is associated with a position vector xi and a velocity vector vi, which are updated according to the following equation:

$${\displaystyle \begin{array}{c}{v}_i\left(t+1\right)=w{v}_i(t)+{R}_1{C}_1\left({p}_i^{best}-{x}_i\right)+{R}_2{C}_2\left({g}^{best}-{x}_i\right)\\ {}{x}_i\left(t+1\right)={x}_i(t)+{v}_i\left(t+1\right)\end{array}}$$
(4)

where t is the iteration number, w is the inertia weight, and \({p}_i^{best}\) and \({g}^{best}\) are the locations of the personal best and the global best, respectively. R1 and R2 are two uniformly distributed random numbers generated inside the interval [0,1]. C1 and C2 represent the particle’s confidence in itself and in its neighbors, respectively. Unimodal PSO cannot locate multiple solutions because all particles move toward the global best (gbest). However, PSO’s mechanism for particle motion can be easily adapted to handle multimodal problems. Carrera and Coello Coello (2009) introduced a modified PSO variation for solving multimodal problems inspired by electrostatic charge interactions [32]. To locate multiple optimal solutions, each solution moves toward the solution with which it has the greatest electrostatic interaction, calculated from the current fitness values. These interactions are determined mathematically according to the following:

$${F}_{ij}=\frac{Q_i{Q}_j}{4\pi {r}^2{\varepsilon}_0}$$
(5)

where Qi and Qj, r ≠ 0, and ε0 refer, respectively, to the electrical charges of the interacting particles, the distance between them, and the permittivity of the vacuum. To apply these ideas in an optimization context, each solution’s fitness value plays the role of the particle’s electric charge. Herein, Eq. 5 is simulated as:

$${F}_{ij}=\alpha \frac{f\left({p}_i^{best}\right)f\left({p}_j^{best}\right)}{{\left\Vert {p}_i^{best}-{p}_j^{best}\right\Vert}^2}$$
(6)

In this case, the constant factor 1/(4πε0) is replaced by the variable α, computed based on Li [34]. gbest in Eq. 4 is replaced by the personal best of the particle with index \({index}_i=\underset{j=1,\dots ,M}{\arg \max }\ {F}_{ij}\), where the maximum is taken over j for a fixed i. Here, M denotes the population size.

$${\displaystyle \begin{array}{c}{v}_i\left(t+1\right)=w{v}_i(t)+{R}_1{C}_1\left({p}_i^{best}-{x}_i\right)+{R}_2{C}_2\left({p}_{index_i}-{x}_i\right)\\ {}{x}_i\left(t+1\right)={x}_i(t)+{v}_i\left(t+1\right)\end{array}}$$
(7)
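As a rough illustration of Eqs. 6 and 7, the sketch below selects, for every particle, the neighbor with the strongest interaction and then applies one velocity and position update. The function names and the values of w, C1, C2, and α are hypothetical choices made for this example; fitness values are assumed to be positive, as histogram counts would be.

```python
import random


def interaction_targets(pbest, fitness, alpha=1.0):
    """For each particle i, return index_i = argmax_j F_ij (Eq. 6), j != i.

    Fitness values are assumed to be positive (e.g. histogram counts).
    """
    size = len(pbest)
    targets = []
    for i in range(size):
        best_j, best_force = i, float("-inf")
        for j in range(size):
            if j == i:
                continue
            r2 = sum((a - b) ** 2 for a, b in zip(pbest[i], pbest[j]))
            if r2 == 0.0:
                continue  # Eq. 5 requires r != 0
            force = alpha * fitness[i] * fitness[j] / r2
            if force > best_force:
                best_j, best_force = j, force
        targets.append(best_j)
    return targets


def multimodal_pso_step(x, v, pbest, fitness, w=0.7, c1=1.5, c2=1.5, rng=random):
    """One velocity and position update following Eq. 7.

    w, c1 and c2 are illustrative defaults, not values from the paper.
    """
    targets = interaction_targets(pbest, fitness)
    new_x, new_v = [], []
    for i in range(len(x)):
        r1, r2 = rng.random(), rng.random()
        attractor = pbest[targets[i]]  # replaces gbest of Eq. 4
        vi = tuple(
            w * v[i][d]
            + r1 * c1 * (pbest[i][d] - x[i][d])
            + r2 * c2 * (attractor[d] - x[i][d])
            for d in range(len(x[i]))
        )
        new_v.append(vi)
        new_x.append(tuple(x[i][d] + vi[d] for d in range(len(x[i]))))
    return new_x, new_v


# Two pairs of nearby particles; the globally fittest one is the last.
pbest = [(0, 0), (1, 0), (10, 0), (11, 0)]
fitness = [5, 4, 3, 6]
print(interaction_targets(pbest, fitness))
```

In this toy swarm each particle is attracted to its close neighbor rather than to the globally fittest particle, which is what lets several peaks be tracked at once.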

4 Proposed method

In an earlier method called 3DHP, all global and local peaks within a 3D color histogram, which correspond to the cluster centers, were located using the aforementioned multimodal approach with a local search strategy. In RGB images, a pixel’s color is derived from a weighted combination of red, green, and blue components, so each pixel can be represented as a three-dimensional feature vector made up of its three color components. A 3D histogram can then be constructed on these three color axes [35]. The presence of several peaks in the histogram indicates that the image comprises multiple distinct segments, with each peak corresponding to a particular segment. In 3DHP, a three-dimensional Gaussian filter was applied to the 3D histogram to reduce the effect of noise and produce a smoothed histogram. This procedure also eliminates insignificant smaller peaks that may have been present in the histogram. Figure 1 illustrates the three-dimensional histogram, the original color distribution, and the color distribution after smoothing for the Lenna image. The 3D histogram was treated as the objective function, and the positions within it formed the solution space; the number of pixels at a particular position is the fitness value of that position. Moreover, an additional local search step proposed in [36] was integrated into the multimodal PSO to enhance its local search ability. In this step, the fitness values are used to examine the neighbors of the ith particle, and the following changes are made to the position of the ith particle:

Fig. 1

Graphics of the 3D histogram, RGB color distribution, and smoothed color distribution for the Lenna image: (a) original Lenna image, (b) three-dimensional histogram of Lenna, (c) and (d) the original and smoothed RGB representations of Lenna [11]

$$\left\{\begin{array}{c}f\left( bestNeares{t}_i\right)\ge f\left( pbes{t}_i\right)\longrightarrow temp=\sum_{d=1}^D{p}_{d,i}^{best}+{C}_1.\mathit{\operatorname{rand}}.\left({p}_{d,i}^{best\_ nearest}-{p}_{d,i}^{best}\right)\\ {}f\left( bestNeares{t}_i\right)<f\left( pbes{t}_i\right)\longrightarrow temp=\sum_{d=1}^D{p}_{d,i}^{best}+{C}_1.\mathit{\operatorname{rand}}.\left({p}_{d,i}^{best}-{p}_{d,i}^{best\_ nearest}\right)\end{array}\right.$$
(8)
$$f(temp)>f\left( pbes{t}_i\right)\longrightarrow pbes{t}_i= temp$$
(9)

where bestNearesti is the particle closest in distance to the ith particle, D is the number of dimensions, and temp is a candidate new position for the ith particle. The candidate then replaces the particle’s best position if it is superior to it (Eq. 9). Consequently, the particles do not all need to move to a single global optimum, and other possible local solutions are not missed. In the next step, K dominant peaks are located. Then, K sets of peak intensity levels corresponding to cluster centers are automatically obtained for each RGB component. These peaks are represented as follows: \({p}_1^{rgb}=\left({r}_1,{g}_1,{b}_1\right),{p}_2^{rgb}=\left({r}_2,{g}_2,{b}_2\right),{p}_3^{rgb}=\left({r}_3,{g}_3,{b}_3\right)\),⋯, \({p}_K^{rgb}=\left({r}_K,{g}_K,{b}_K\right)\).

Additionally, to eliminate non-dominant clusters, it is beneficial to constrain the distance that separates any two peaks. A dominant peak therefore eliminates all non-dominant peaks within its radius, which is set by a distance limit parameter. This procedure is optional and can be skipped if desired. For the 3DHP, this parameter was set to 80 pixels. Ultimately, every pixel is assigned to the peak closest to it in terms of the Euclidean distance. The Euclidean distance between the kth peak and the (i, j)th pixel is calculated as:

$$\left\Vert {p}_k^{rgb}-{I}_{i,j}^{rgb}\right\Vert =\sqrt{{\left({p}_k^r-{I}_{i,j}^r\right)}^2+{\left({p}_k^g-{I}_{i,j}^g\right)}^2+{\left({p}_k^b-{I}_{i,j}^b\right)}^2}$$
(10)
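The optional pruning of non-dominant peaks and the Euclidean distance of Eq. 10 can be sketched as follows. The greedy rule used here (visit peaks from highest to lowest and keep a peak only if it lies farther than the limit from every peak already kept) is an assumption about how the elimination is carried out, and `prune_peaks` and the sample peaks are hypothetical.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two RGB triples (Eq. 10)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def prune_peaks(peaks, heights, limit=80.0):
    """Drop every peak that lies within `limit` of a higher, kept peak.

    peaks   -- list of (r, g, b) peak positions
    heights -- histogram value at each peak
    limit   -- distance limit parameter (80 in the text above)
    """
    order = sorted(range(len(peaks)), key=lambda k: heights[k], reverse=True)
    kept = []
    for k in order:
        if all(euclidean(peaks[k], peaks[j]) > limit for j in kept):
            kept.append(k)
    return [peaks[k] for k in kept]


# Illustrative peaks: the first two are close together in RGB space.
peaks = [(10, 10, 10), (20, 15, 12), (200, 200, 200), (120, 30, 40)]
heights = [500, 900, 300, 50]
print(prune_peaks(peaks, heights))
```

The number of peaks that survive this step is the number of clusters K used for the final pixel assignment.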

Also, a non-Euclidean distance criterion was proposed in [37] and applied there to color image segmentation using FCM. This distance is calculated as:

$$ned\left({x}_i,{c}_j\right)=\sum_{a=1}^A\left(1-{e}^{-{\left({x}_{i,a}-{c}_{j,a}\right)}^2}\right)$$
(11)

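where A is the number of features, \({x}_{i,a}\) is the ath feature of the ith data point, and \({c}_{j,a}\) is the ath feature of the jth cluster center.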
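A short sketch may help show how this distance behaves. Each feature contributes a value between 0 and 1, so the total is bounded by A, whereas the squared Euclidean distance grows without limit. The function name `ned` follows Eq. 11; the sample vectors are made up for illustration.

```python
import math


def ned(x, c):
    """Non-Euclidean distance of Eq. 11 between two feature vectors."""
    return sum(1.0 - math.exp(-((xa - ca) ** 2)) for xa, ca in zip(x, c))


def squared_euclidean(x, c):
    """Squared Euclidean distance, for comparison only."""
    return sum((xa - ca) ** 2 for xa, ca in zip(x, c))


a = (0.2, 0.5, 0.9)
b = (0.3, 0.5, 0.1)
outlier = (10.0, 0.5, 0.9)

print(ned(a, a), ned(a, b), ned(a, outlier))
print(squared_euclidean(a, outlier))
```

Because a single outlying feature can add at most 1 to the total, the measure is less sensitive to outliers than the Euclidean norm. The same saturation means that the scale of the features matters; the values above are assumed to lie roughly in [0, 1].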

In the proposed method, the divisor of the fraction in Eq. 6, which is based on \(\left\Vert {p}_i^{best}-{p}_j^{best}\right\Vert\), is replaced by the non-Euclidean distance of Eq. 11. For two personal best positions, this distance is expressed as:

$$ned\left({p}_i^{best},{p}_j^{best}\right)=\sum_{d=1}^{D}\left(1-{e}^{-{\left({p}_{i,d}^{best}-{p}_{j,d}^{best}\right)}^2}\right),\quad D=3$$
(12)

Therefore:

$${F}_{ij}=\alpha \frac{f\left({p}_i^{best}\right)f\left({p}_j^{best}\right)}{ned\left({p}_i^{best},{p}_j^{best}\right)}$$
(13)

Also, after locating the histogram peaks, every pixel will be assigned to the peak (cluster head) closest to it in terms of the non-Euclidean distance instead of the Euclidean distance. Consequently, Eq.10 can be reformulated as:

$$ned\left({p}_k^{rgb},{I}_{i,j}^{rgb}\right)=\sum_{c\in \left\{r,g,b\right\}}\left(1-{e}^{-{\left({p}_k^c-{I}_{i,j}^c\right)}^2}\right)$$
(14)
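The final labeling step can be sketched as below. The sketch assumes the per-channel form of the non-Euclidean distance given in Eq. 11 and assumes channel values scaled to [0, 1]; the scaling is not stated in the text and is chosen here only so that the exponential does not saturate. `assign_pixels` and the sample data are hypothetical.

```python
import math


def ned(p, q):
    """Per-channel non-Euclidean distance between two RGB triples."""
    return sum(1.0 - math.exp(-((a - b) ** 2)) for a, b in zip(p, q))


def assign_pixels(image, peaks):
    """Label every pixel with the index of its nearest peak under ned.

    image -- 2D list of (r, g, b) tuples, channels assumed in [0, 1]
    peaks -- list of (r, g, b) cluster heads found in the histogram
    """
    return [
        [min(range(len(peaks)), key=lambda k: ned(peaks[k], pixel))
         for pixel in row]
        for row in image
    ]


# Three cluster heads and a 2 x 2 image (illustrative values only).
peaks = [(0.1, 0.1, 0.1), (0.8, 0.2, 0.2), (0.9, 0.9, 0.9)]
image = [
    [(0.0, 0.0, 0.05), (0.85, 0.25, 0.2)],
    [(1.0, 0.95, 0.9), (0.15, 0.1, 0.1)],
]
print(assign_pixels(image, peaks))
```

The resulting label map has one integer per pixel and as many distinct labels as there are detected peaks.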

It is worth mentioning that, as a preprocessing step, Gaussian smoothing is applied to the RGB image before the 3D histogram is calculated. The σ and window size of this filter are set to 0.5 and 3 × 3, respectively.
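To make the histogram stage concrete, the following sketch builds a coarse 3D RGB histogram and smooths it with a separable Gaussian kernel. Several choices are assumptions made for brevity: 8 bins per channel instead of the full intensity range, a 3-tap kernel with σ = 0.5 applied to the histogram, zero padding at the borders, and the function names themselves.

```python
import math


def rgb_histogram(pixels, bins=8):
    """Count 8-bit RGB pixels per (r, g, b) bin in a bins^3 grid."""
    hist = [[[0.0] * bins for _ in range(bins)] for _ in range(bins)]
    for r, g, b in pixels:
        hist[r * bins // 256][g * bins // 256][b * bins // 256] += 1.0
    return hist


def gaussian_kernel(sigma=0.5, radius=1):
    """Normalised 1D Gaussian kernel with 2 * radius + 1 taps."""
    w = [math.exp(-(k * k) / (2.0 * sigma * sigma))
         for k in range(-radius, radius + 1)]
    total = sum(w)
    return [value / total for value in w]


def smooth(hist, kernel):
    """Separable 3D smoothing: filter along r, then g, then b."""
    n = len(hist)
    radius = len(kernel) // 2
    for axis in range(3):
        out = [[[0.0] * n for _ in range(n)] for _ in range(n)]
        for i in range(n):
            for j in range(n):
                for k in range(n):
                    idx = [i, j, k]
                    acc = 0.0
                    for t, weight in enumerate(kernel):
                        pos = idx[axis] + t - radius
                        if 0 <= pos < n:  # zero padding outside the grid
                            src = idx.copy()
                            src[axis] = pos
                            acc += weight * hist[src[0]][src[1]][src[2]]
                    out[i][j][k] = acc
        hist = out
    return hist


# Mostly dark pixels plus a smaller bright group (illustrative data).
pixels = [(10, 10, 10)] * 50 + [(12, 8, 9)] * 5 + [(200, 200, 200)] * 30
smoothed = smooth(rgb_histogram(pixels), gaussian_kernel())
print(smoothed[0][0][0], smoothed[6][6][6])
```

The smoothed histogram is the fitness landscape searched by the multimodal PSO: the value stored at a position is the fitness of a particle located there.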

The flow diagram of the overall method is illustrated in Fig. 2.

Fig. 2

The flow diagram of the overall method. A Gaussian filter smooths the 3D RGB histogram, eliminating non-dominant peaks. A multimodal particle swarm optimization method then identifies the peaks, and each pixel is assigned to the nearest peak based on the non-Euclidean distance

5 Experimental results and performance evaluation

In this section, extensive experiments are performed on the proposed method. The results of the proposed method are compared with well-known FCM [22], FCM_FW [3], and FCM_FWCW [38]. The required parameters of the proposed method and their values are shown in Table 1. Also, the fuzziness parameter in all soft (fuzzy) clustering methods is set to 2. The maximum number of iterations for FCM, FCM_FW and FCM_FWCW is 100.

Table 1 Required parameters of the proposed method

5.1 Dataset

We used the following two datasets to evaluate the proposed method:

5.2 Evaluation metrics

As the brain MRI slices are heterogeneous, qualitative (visual) evaluation of different methods is insufficient to analyze the results accurately. Therefore, quantitative metrics are needed to evaluate the results of various methods [40]. In the experiments, the following two groups of metrics are used to measure the performance of algorithms.

1) Internal clustering metrics: for the metrics in this group, a lower value indicates a better segmentation result.

F: this metric penalizes over-segmentation [41] (segmenting one region of the image into more than one segment):

$$\boldsymbol{F}=\frac{1}{1000\ \left(\textrm{M}\times N\right)}\sqrt{R}\sum_{i=1}^R\frac{{e_i}^2}{\sqrt{A_i}}$$
(15)

where M and N represent the length and width of the input image, R is the number of segmented regions, Ai indicates the number of pixels in the ith segmented region, ei indicates the color error in region i, and \(\sqrt{R}\) represents a penalizing term that discourages over-segmentation.

\({\boldsymbol{F}}^{\prime }\): this metric penalizes over-segmentation and is noise-robust [42]. Here R(A) denotes the number of regions having exactly area A, and Max is the area of the largest region:

$${\boldsymbol{F}}^{\prime }=\frac{1}{10000\ \left(\textrm{M}\times N\right)}\sqrt{\sum_{A=1}^{\mathit{\operatorname{Max}}}{\left[R(A)\right]}^{1+\frac{1}{A}}}\sum_{i=1}^R\frac{{e_i}^2}{\sqrt{A_i}}$$
(16)

Q: this metric penalizes non-homogeneous regions [42]:

$$\boldsymbol{Q}=\frac{1}{10000\ \left(\textrm{M}\times N\right)}\sqrt{R}\sum_{i=1}^R\left[\frac{{e_i}^2}{1+\log {A}_i}+{\left(\frac{R\left({A}_i\right)}{A_i}\right)}^2\right]$$
(17)
2) External clustering metrics: for the metrics in this group, a higher value indicates a better segmentation result.

Accuracy: the number of correctly predicted pixels divided by the total number of pixels. This metric is calculated by Eq. (18).

$$Accuracy=\frac{TN+ TP}{TN+ TP+ FN+ FP}$$
(18)

Precision: It is the ratio of correct positive prediction pixels to the number of positive pixels predicted. This metric is calculated by Eq. (19).

$$Precision=\frac{TP}{TP+ FP}$$
(19)

Recall: It is the ratio of the number of correct positive prediction pixels to the number of all relevant pixels. This metric is calculated by Eq. (20).

$$Recall=\frac{TP}{TP+ FN}$$
(20)

F1 Score: It is the harmonic mean of Precision and Recall. This metric is calculated by Eq. (21).

$$F1\ Score=2\times \frac{Recall\times Precision}{Recall+ Precision}$$
(21)

Specificity: It is the proportion of all negative pixels that are correctly predicted as negative. This metric is calculated by Eq. (22).

$$Specificity=\frac{TN}{TN+ FP}$$
(22)

In Eqs. (18) to (22), TP, FP, TN, and FN represent True Positive, False Positive, True Negative, and False Negative, respectively.

In our experiments, these metrics are expressed as a percentage. A high percentage indicates a better performance.
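Eqs. (18) to (22) can be computed from a predicted binary tumor mask and its ground truth as in the sketch below. The function names, the convention of returning 0 when a denominator is zero, and the tiny masks are assumptions made for this illustration.

```python
def confusion_counts(pred, truth):
    """Count TP, FP, TN and FN over two binary masks of equal shape."""
    tp = fp = tn = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif t:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def ratio(numerator, denominator):
    """Percentage, defined as 0 when the denominator is zero (assumption)."""
    return 100.0 * numerator / denominator if denominator else 0.0


def external_metrics(tp, fp, tn, fn):
    """Accuracy, precision, recall, F1 and specificity as percentages."""
    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    f1 = (2.0 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "specificity": ratio(tn, tn + fp),
    }


# 1 marks a tumor pixel, 0 a non-tumor pixel (illustrative masks).
pred = [[1, 1, 0, 0], [0, 1, 0, 0]]
truth = [[1, 0, 0, 0], [0, 1, 1, 0]]
print(external_metrics(*confusion_counts(pred, truth)))
```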

5.3 Experiment 1: Visualization-based analysis using internal metrics

In this section, we compare the proposed method with the other methods qualitatively and quantitatively. Several images have been selected from each dataset for the quality assessment. We tried to select images that cover different types of tumors: small and large tumor sizes, different tumor tissues, spherical and non-spherical tumor shapes, and different lighting conditions. 23 images are selected from the first dataset, 15 from the second, and 15 from the third.

Figures 3, 4, and 5 demonstrate the results of the visual qualitative analysis of the first, second and third datasets, respectively. The segmentation results for each method are displayed by employing a distinct color set to the base image to highlight the clusters obtained. Tables 2 (first dataset), 4 (second dataset), and 6 (third dataset) contain information regarding the peak locations of the three-dimensional histogram and the cluster centroids for each cluster achieved by FCM, FCM_FW [3] and FCM_FWCW. In the same way, Tables 3, 5, and 7 show the numerical and qualitative analysis of the results from each of the three tested methods.

Fig. 3

Original MRI and segmented MRI slices by the M3DHP, FCM, FCM_FW and FCM_FWCW (first dataset)

Fig. 4

Original MRI and segmented MRI slices by the M3DHP, FCM, and FCM_FWCW on the second dataset

Fig. 5

Original MRI and segmented MRI slices by the M3DHP, FCM, and FCM_FWCW on the third dataset

Table 2 Cluster heads and centers of proposed and other methods (first dataset)
Table 3 Statistical results of the first dataset

In our tests, the number of peaks found by M3DHP is used as the number of clusters (c) for FCM, FCM_FW and FCM_FWCW. However, with the FCM_FW and FCM_FWCW algorithms, some clusters can end up empty, with no pixels assigned to them: the algorithm may converge to a solution in which one or more clusters have no associated data points. This can arise from various factors, including the selection of the initial cluster centers, the data distribution, or a specified number of clusters that does not match the natural clustering of the data.

As Figs. 3, 4, and 5 show, M3DHP clearly distinguishes the background for all images in all datasets. With FCM, on the other hand, the background is mistakenly divided into many regions (over-segmented) in 11 of the 23 images of the first dataset, 5 of the 15 images of the second dataset, and 5 of the 15 images of the third dataset.

For TCGA_HT_8105_19980826_26, the tumor can be distinguished more easily with M3DHP than with FCM, which mistakenly assigns many pixels to the tumor cluster. Moreover, for TCGA_FG_6688_20020215_24, the tumor in the bottom center of the brain is very clearly distinguished when using M3DHP, whereas FCM is not successful in this case. For TCGA_DU_7014_19860618_30, the proposed algorithm segments the image correctly, while the image segmented by FCM represents both brain and tumor regions as a single region; in this case, FCM under-segments the image. Visual inspection reveals that the M3DHP method generally yields more homogeneous segmentation regions (Tables 4, 5, and 6).

Table 4 Cluster heads and centers of proposed and other methods (second dataset)
Table 5 Statistical results of the second dataset
Table 6 Cluster heads and centers of proposed and other methods (third dataset)

As shown in Tables 3, 5, and 7, according to F and F′, the proposed method outperforms FCM in 12 out of 23 cases. Furthermore, according to Q, the proposed method outperforms FCM in 17 out of 23 cases. Results for the three evaluation functions F, F′, and Q suggest that the quantitative performance of both approaches is comparable when applied to the same image. It is important to note that there is not a huge disparity between these numbers; they all tend toward zero. The M3DHP approach demonstrates its efficacy by generating reliable results for the three statistical metrics F, F′, and Q. When it comes to F, F′, and Q, M3DHP typically offers superior performance to FCM in most cases. The last three rows of Tables 3, 5, and 7 indicate the average rank of algorithms according to all three performance indicators. The M3DHP ranked first according to all three performance indicators.

Table 7 Statistical results of the third dataset

In the comparative analysis of the three algorithms across various MRI slices using metrics F, F′, and Q, our algorithm consistently demonstrates superior performance. It achieves the top mean rank in all three metrics, indicating its robustness and effectiveness in clustering. FCM_FW generally ranks second, outperforming the standard FCM, which consistently ranks last. The consistent top ranking of our algorithm across all metrics underscores its potential as a preferred choice for clustering tasks in varied contexts.

5.4 Experiment 2: Numerical analysis of all images using internal metrics

In this section, we evaluate the average results obtained over the datasets’ images. As shown in Tables 8 and 9, M3DHP has the best results. In terms of both the F and F′ criteria, FCM_FWCW has the next-best results after M3DHP. Regarding the Q criterion, the next-best results after M3DHP are obtained by FCM_FW on the first dataset and by FCM on the second dataset. FCM_FWCW performs well in terms of F′ and F, with values similar to those of M3DHP; however, its Q values are much larger than those of M3DHP. On the third dataset, as on the other datasets, our method ranks first.

Table 8 The average performance of M3DHP and other state-of-the-art methods on all samples
Table 9 The mean rank of M3DHP and other state-of-the-art methods on all samples

These findings lead us to the conclusion that M3DHP can demonstrate competent performance during the segmentation of brain magnetic resonance images. The visual and numerical results show that the proposed M3DHP technique produces promising segmentation results. The method’s ability to automatically generate the desired number of clusters and cluster centroids proves this.

The thorough testing demonstrates that the proposed method performs well for image segmentation, surpassing the performance standards set by well-known methods like FCM, FCM_FWCW, and FCM_FW. The consistently superior outcomes across multiple evaluation criteria underscore the potential of the proposed method as a noteworthy contribution to the field.

5.5 Experiment 3: Analysis of all images with external metrics

In this experiment, to investigate the proposed method deeply and compare the obtained results with other state-of-the-art methods, the performance of the proposed method is evaluated with image ground truths and external evaluation metrics, such as accuracy, F1, precision, recall, and specificity. The statistical results are reported in Table 10. The visual segmentation results are also shown in Fig. 6. We test the proposed method only on the first dataset because the ground truth for the second one is not available.

Fig. 6

Original MRI ground truth and segmented MRI slices by the M3DHP, FCM, FCM_FW and FCM_FWCW

Table 10 Statistical results of the first dataset with external metrics

Table 10 and Fig. 6 show that the proposed method has the best average performance on the tested images. For image TCGA_DU_5855_19951217_23, the proposed method segments tumor and non-tumor areas well, whereas the other methods could not accurately detect the entire tumor area because of its low light intensity, shape, and texture. Also, in image TCGA_DU_7014_19860618_45, the compared methods segment the border of the skull as a tumor area, mainly because of the high brightness of the tissue surrounding the skull, while the proposed method detects the tumor areas well. The accuracy, F1-Score, and precision of the proposed method average 95.52%, 79.68%, and 77.2257%, respectively, over all testing images. This result highlights the efficacy of the feature combination employed in our method. The recall and specificity of the proposed method are lower than those of the FCM_FW method. The reason is that feature weighting schemes applied to efficiently extracted features can improve the results, whereas our method, like the FCM and FCM_FWCW methods, does not use a feature extraction phase.

6 Conclusion

In this paper, we suggest a modified form of the 3D histogram-based segmentation technique that can choose the appropriate number of segments. The number of segments is determined by detecting histogram peaks with a multimodal optimization algorithm. The proposed method has been applied to the segmentation of brain MRI. Because the optimal number of clusters does not need to be known in advance, M3DHP is more flexible in practice than other methods. To demonstrate the efficiency of the proposed method, it has been compared with the well-known FCM clustering scheme. The experimental results show that the suggested strategy produces the desired outcomes and outperforms FCM. The method currently works with a single 2D MRI slice; the next step in our research will focus on extending the algorithm to 3D MRI segmentation.