1 Introduction

The human body consists of many types of cells, each with its own specialized function. Normally, cells grow, old cells die, and new cells replace them. When a cell loses the ability to control its growth, however, new cells are produced when the body does not need them and old cells do not die as they should. The build-up of these extra cells often forms a mass of tissue called a tumor [2, 75]. The brain is the central part of the central nervous system and consists of soft, spongy, non-replaceable tissue. It is a sensitive organ with three major parts (cerebrum, cerebellum, and brainstem). It receives information from the body’s senses, acts as the master controller for operations inside the body, and allows us to manage our environment. In children, brain tumors tend to occur in the posterior cranial fossa, whereas in adults they tend to occur in the anterior two-thirds of the cerebral hemispheres. A tumor can affect any part of the brain. In the beginning phase, tumors do not contain cancer cells. In the second phase, brain tumors [22] take shape as metastatic tumors. A malignant brain tumor contains cancer cells. Cancer is characterized by the abnormal, uncontrolled growth and division of cells [32]. Uncontrolled cell growth in the brain tissue is called a brain tumor, and the brain tumor is one of the most lethal cancers. A tumor that originates from the brain tissue cells is called a primary brain tumor. In some cases, cells become cancerous in another part of the body and spread to the brain; this is called a metastatic tumor. Gliomas are a type of tumor that originates from glial cells. Early-stage diagnosis of gliomas improves the treatment options available to the patient.

A cancerous mass may develop elsewhere in the human body, and cells may break away from that primary tumor. The detached cells travel through the blood vessels and the lymphatic system and are carried by the blood circulation to the brain, where they accumulate. Magnetic resonance imaging (MRI) [44] is an advanced medical imaging method used to visualize the internal structure of the body, and it produces high-resolution images of the body parts. In MRI, images of the internal structures are obtained for diagnosis using a strong magnetic field and radio waves. From these high-resolution images, the development and location of tumor abnormalities can be examined and the tumor can be segmented.

According to a study by the American Brain Tumor Association [60, 65], 14,000 people die annually due to brain cancers. In 2021, more than 84,000 people were expected to be treated for primary brain tumors, while more than 700,000 people in the United States are now living with brain tumors. According to the annual statistics of the Central Brain Tumor Registry of the United States [72], there are 120 types of brain tumors. The statistics are alarming, and the number of cases is increasing day by day.

The main motivation for doctors is to learn about brain tumor diagnosis methods and to adopt new clinical methods for prevention. New methods will make better use of imaging scans to track and monitor tumor growth [1]. Researchers are investigating biomarkers that support the diagnosis of brain tumors and help predict whether a particular treatment will improve a patient’s prognosis. Radiologists use MRI images to examine the patient [15] and to learn the tumor’s location, type, and size in order to provide an accurate diagnosis and plan treatment. It is quite difficult for a person who has survived a brain tumor to remain optimistic. Survivors can experience cognitive and physical changes that leave them unable to live as they did before the tumor. Immunotherapy [19] is a biological therapy used to boost the body’s natural defenses against the tumor. It uses materials and vaccines, including those involving dendritic cells, to restore the immune system [79]. Tumor detection and segmentation [6] support oncolytic virus therapy, blood-brain barrier disruption, gene therapy directed at tumor growth, and new drug combinations that address the resistance of tumor cells. The study of tumor segmentation in MRI helps practitioners find better ways to reduce symptoms and the side effects of treatment, improving both treatment and quality of life [13].

The segmentation and classification [18, 31] of MRI images must be highly accurate for a proper analysis of brain tumors. Accurate segmentation of brain tumors from MRI images is a challenging and crucial task in diagnosis and treatment planning; it involves extracting one or more regions from the image to form the region of interest. Different algorithms have been developed for brain tumor detection, such as region-based methods, threshold-based techniques, classification approaches, deformable models, and deep learning models [62]. Deformable models are among the most popular approaches for brain tumor segmentation in MRI images. Extensive work has been carried out to explore how MRI image segmentation can provide meaningful information from medical data. Research on brain tumors has investigated the different types of tumors and the shape of the cerebrospinal fluid, both of which add to the complexity of tumor detection. Regardless of the segmentation technique employed, the quality of the segmentation depends significantly on the contrast of the medical images, incomplete boundaries, and the level of noise. Deep learning plays a very important role in the detection of brain tumors. There are several algorithms for image segmentation, such as thresholding, region-based segmentation, fuzzy clustering, k-means clustering, neural networks, the level set method, Otsu’s method, and the watershed algorithm. The CNN [3, 81] has been widely used for tumor classification and detection. The CNN has a deep-layered architecture [56] and supports model scaling, which allows the model size to be increased for improved accuracy. This, together with the availability of dedicated hardware for artificial intelligence systems, is the CNN’s most significant advantage over other segmentation methods.
The problem addressed in this work is to analyze the performance of different image segmentation techniques for brain tumor detection based on parameters such as response time, accuracy, recall, and precision. The main contribution of the paper is to study the different image segmentation techniques, including CNN, and to apply them to the BRATS dataset to estimate the simulation time and performance. This study will help researchers identify the best approach for tumor segmentation and estimate the simulation and performance parameters in advance.
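
The accuracy, precision, and recall mentioned above can be illustrated with a short sketch. The paper does not spell out its formulas at this point, so the standard pixel-wise definitions from the confusion matrix are assumed here, and the function name `segmentation_scores` and the toy masks are hypothetical.

```python
def segmentation_scores(predicted, truth):
    """Return (accuracy, precision, recall) for two flat sequences of 0/1 labels.

    Assumption: 1 marks a tumor pixel and 0 marks background.
    """
    if len(predicted) != len(truth):
        raise ValueError("masks must contain the same number of pixels")
    pairs = list(zip(predicted, truth))
    tp = sum(1 for p, t in pairs if p == 1 and t == 1)  # tumor found correctly
    tn = sum(1 for p, t in pairs if p == 0 and t == 0)  # background kept correctly
    fp = sum(1 for p, t in pairs if p == 1 and t == 0)  # background marked as tumor
    fn = sum(1 for p, t in pairs if p == 0 and t == 1)  # tumor pixel missed
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall


# Toy example: eight pixels of a predicted mask and a reference mask.
predicted_mask = [1, 1, 0, 0, 1, 0, 0, 0]
reference_mask = [1, 0, 0, 0, 1, 1, 0, 0]
accuracy, precision, recall = segmentation_scores(predicted_mask, reference_mask)
```

Response time would be measured separately, for example by timing each segmentation call.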

2 Related work

Okaili et al. [4] discussed a strategy to differentiate intra-axial brain masses and determined its accuracy on MRI images. The institutional review board approved the strategy, which covered conventional MRI, perfusion MRI, proton MR spectroscopy, and diffusion-weighted MRI and classified intra-axial masses as low-grade primary neoplasms, high-grade primary neoplasms, and metastatic neoplasms; a Bayesian statistical approach was used to determine the accuracy of the system. Anila et al. [9] used multi-resolution analysis and noise removal to detect abnormalities in the brain. The multi-resolution analysis was based on curvelet and contourlet approximations, and the contourlet method gave better results in recognizing brain abnormalities. Balafar et al. [10] reviewed different techniques for brain tumor detection and segmentation in MRI images, noting that the accuracy of tumor detection depends on the segmentation method. They discussed the Markov random field model, the watershed algorithm, anatomical deviations, atlas-based segmentation, the multi-region-based method, self-organizing maps, and learning vector quantization. They suggested that brain segmentation can be improved by combining the atlas-based method with parallelization. Chaudhary et al. [14] studied image segmentation using a clustering technique, with a support vector machine (SVM) used to detect brain tumors. The classifiers were able to detect seven features, and the SVM achieved 94.6% accuracy. Gholipour et al. [20] noted that automatic functional localization and functional brain imaging are very important for temporal and higher-resolution analysis of brain tumors. Functional maps help in identifying dementia and can reveal many differences between healthy subjects and tumor patients, which leads to valid interpretations and conclusions.

Ratan et al. [66] reviewed different methods for tumor segmentation from MRI data, since manual identification by medical experts is time-consuming. They pointed out methods based on intensity, texture, regions, clustering, classification, fuzzy logic, neural networks, knowledge, edges, probabilistic models, fusion, SVM, level sets, watershed, atlas guidance, morphology, fuzzy C-means, and k-means clustering. They suggested that combining thresholding with an SVM or a Bayesian classifier could provide the best results for brain tumor detection. Ratan et al. [68] proposed an algorithm to detect tumors from MRI images through symmetry analysis and to calculate the area of the tumor, using median filters, thresholding, and morphological operations. Li et al. [45] used a unified level set technique for semi-automatic liver tumor segmentation that integrates prior information about the tumor, the image gradient, and regional competition. It was applied directly to contrast-enhanced computed tomography (CT) images, and unsupervised fuzzy clustering was used to estimate the probabilistic distribution of liver tumors. Thapaliya et al. [74] used the level set method for brain tumor segmentation with automatic selection of local statistics. The threshold values were adjusted automatically for all MR images. Goel et al. [23] discussed the watershed algorithm and the level set method for segmenting brain tumor regions in MRI images. A comparative study was carried out in MATLAB to estimate the performance and response time, and the level set method gave a better response than Otsu’s method. Mustaqeem et al. [54] showed that brain tumor detection is possible using watershed segmentation, threshold segmentation, and morphological operators, and they successfully processed samples of scanned brain MRI images. Patil et al. [63] used a watershed algorithm and morphological operators to detect tumors in brain MRI images, applying noise removal, region segmentation, and morphological operators to the scanned images.

Remya et al. [67] used the fuzzy C-means method for noise filtering of MRI images, aiming at exact identification of the cerebrum tumor, and applied Otsu’s method for image segmentation. The authors claimed that their fuzzy C-means approach provided good results. Sain et al. [69] explained the structure of the human brain and proposed an algorithm to detect brain tumors using Otsu’s method for segmentation. Amin et al. [8] used the discrete wavelet transform (DWT) for image fusion, which provided complete information about the tumor regions in brain MRI. A partial diffusion filter was used to eliminate noise, and global thresholding was applied for tumor segmentation. Khode et al. [38] used DWT for brain tumor detection, noting that MRI is an important modality capable of providing a detailed image of the human body. MRI images were used for testing, and the tumor was segmented from these images. Kumar et al. [39] suggested the use of DWT for image segmentation and determined the vertical, inclined, and circular regions. They used the Haar DWT, in which the image is divided into four sub-bands, “LL”, “LH”, “HL”, and “HH”, with coefficients of either +1 or -1. Shree et al. [71] noted that identifying, detecting, and segmenting the exact position of a brain tumor in MRI images is a time-consuming and tedious process. The system needs fast transforms and computations because MRI images have many modalities and contain noise. They used DWT for image segmentation with morphological filters to remove the noise, which improved system performance. Singh et al. [73] suggested the use of DWT and the discrete cosine transform (DCT) for image processing applications. They used the DCT and DWT for image compression and watermarking. The processes involved image segmentation operations, and the best results were obtained with DWT. Viji et al. [76] suggested that watershed segmentation can be used for the automatic detection of brain tumors and that computer-aided design (CAD) tools can be used for tumor identification in the manual segmentation process. This supports 2D and 3D visualization of the tumor for tumor assessment and surgical planning.

Liao et al. [49] applied adaptive data hiding to medical Joint Photographic Experts Group (JPEG) images; to protect patient data, the method embeds data in the inter-block changes while preserving the differences between DCT coefficients at the same position in adjacent DCT blocks. Reversible data hiding for image encryption and decryption [46] was used to estimate the complexity of each block with a minimum bit error rate. The discrete Fourier transform (DFT) [48] and compressive sensing were used for separable data hiding in image encryption and decryption for security. Joseph et al. [35] described tumor segmentation using K-means clustering, with morphological operators used to avoid misclustered regions. Zhang et al. [80] used adaptive Wiener filtering for denoising and different morphological operators to eliminate non-brain tissue. The K-means++ clustering method was combined with the fuzzy C-means method to segment the images. This clustering technique not only improved the stability of the algorithm but also reduced its sensitivity to the clustering parameters.

Moeskops et al. [52, 53] used a CNN for the automatic segmentation of MR brain images into a number of tissue classes. The technique was applied to five different data sets covering different age groups, and accurate results were obtained. Olszewska et al. [58, 59] discussed the performance of an autonomous intelligent vision system, which was evaluated based on false-positive rate, false-negative rate, accuracy, and precision. Pereira et al. [64] used a CNN with small (3 × 3) kernels to segment brain MRI images. The small kernels allow a deeper CNN architecture to be designed. The results were verified on the BRATS 2013 and BRATS 2015 databases. Seetha et al. [70] proposed automatic brain tumor detection using a deeper CNN architecture for classification. The simulation was done in the Python programming language. The CNN architecture consisted of three layers with small kernels and achieved 97.5% accuracy on the BRATS 2015 testing dataset. Wang et al. [77] used a multiscale CNN to segment dermoscopic images. The images were first preprocessed with contrast enhancement and then segmented using a segmentation loss function.

Kaur et al. [36] applied machine learning methods such as random forest, SVM, decision trees, and K-NN to evaluate performance on real-time data for remote health monitoring; random forest achieved 96.42% accuracy for breast cancer and other diseases, which was high in comparison to the other algorithms. Dargan et al. [16, 17] presented a comprehensive review of the need for deep learning in medical imaging and the further use of machine learning to evaluate performance. Deep learning introduces new data-processing techniques and infrastructures, allowing computers to learn different representations and objects. They also discussed a framework for a biometric recognition system comprising several processes and sequences for the identification of behavioral modalities. Kumar et al. [41] predicted COVID-19 cases and deaths in Italy, Spain, Japan, India, the US, and the UK using auto-regressive integrated moving average (ARIMA) and long short-term memory (LSTM) models. Ghosh et al. [21] applied machine learning algorithms such as KNN, support vector machine, and multi-layer perceptron (MLP) to deoxyribonucleic acid (DNA) microarray data. Bansal et al. [11] used machine learning algorithms for object recognition. The features of the objects were classified using decision tree, KNN, and random forest classifiers with 80.8%, 74.8%, and 85.9% accuracy, respectively. Kumar et al. [12] applied machine learning to face detection and face recognition from arbitrary images, to capture human insight and knowledge for recognizing and studying face data. Gupta et al. [25] used the scale-invariant feature transform (SIFT) and speeded-up robust features (SURF) with random forest and decision tree classifiers to study different facial features [42] and achieved 99.7% accuracy.

3 Image segmentation methods

Various techniques exist for image segmentation. Some of them are described in this section.

3.1 Otsu’s method

Otsu’s technique [61] is an image segmentation method based on image thresholding. It is a non-linear operation that converts a grayscale image into a binary image by assigning each pixel to one of two levels, foreground or background. It therefore assumes a bimodal histogram. It uses a statistical approach [47] to select the threshold. The technique minimizes the weighted sum of the within-class variances of the foreground and background pixels to obtain an optimal threshold; that is, the optimal threshold separates the two classes so that their combined spread is minimal. Equivalently, the same threshold maximizes the between-class variance, which is more convenient to compute. The approach relies on the statistics of the pixel intensities to identify objects rather than on spatial coherence. It can also be made locally adaptive, which indirectly equalizes brightness so that objects remain detectable when the bimodal brightness behavior changes across the image. Otsu’s approach [29] searches for the threshold value that minimizes the intra-class variance, defined as the weighted sum of the variances of the two classes. Equation (1) gives the expression for the ‘within-class variance’, weighted by the class probabilities

$$S_w^2\left(x\right)=w_1\left(x\right)S_1^2\left(x\right)+w_2\left(x\right)S_2^2\left(x\right)$$
(1)
\({w}_{1}\left(x\right)\): probability related to class-1

\({w}_{2}\left(x\right)\): probability related to class-2

x: threshold that divides the two classes

\({S}_{1}^{2}\left(x\right)\): variance related to class-1

\({S}_{2}^{2}\left(x\right)\): variance related to class-2

The class probabilities, computed over the L bins of the histogram, are given by

$$w_1\left(x\right)=\sum_{j=0}^{x-1}p\left(j\right)$$
(2)
$${w}_{2}\left(x\right)=\sum _{j=x}^{L-1}p\left(j\right)$$
(3)

The equation of the class means is presented as

$${\mu }_{1}\left(x\right)=\sum _{j=0}^{x-1}\frac{j.p\left(j\right)}{{w}_{1}\left(x\right)}$$
(4)
$${\mu }_{2}\left(x\right)=\sum _{j=x}^{L-1}\frac{j.p\left(j\right)}{{w}_{2}\left(x\right)}$$
(5)

\({\mu }_{1}\left(x\right)\) and \({\mu }_{2}\left(x\right)\) are the means of class-1 and class-2, respectively, and \({\mu }_{X}\) is the total mean. The total mean is given by

$${\mu }_{X}=\sum _{j=0}^{L-1}j.p\left(j\right)$$
(6)
$${\mu }_{X}={w}_{1}\left(x\right){\mu }_{1}\left(x\right)+{w}_{2}\left(x\right){\mu }_{2}\left(x\right)$$
(7)

The summation of the class probabilities is equal to 1.

$${w}_{1}\left(x\right)+{w}_{2}\left(x\right)=1$$
(8)

The total variance is the sum of the within-class (weighted) variance and the between-class variance. For a given threshold, the between-class variance is the weighted sum of the squared distances between the class means and the total mean. The individual class variances are calculated using the equations [74, 79]

$${S}_{1}^{2}\left(x\right)=\sum _{j=0}^{x-1}{\left[j-{\mu }_{1}\left(x\right)\right]}^{2}\frac{p\left(j\right)}{{w}_{1}\left(x\right)}$$
(9)
$${S}_{2}^{2}\left(x\right)=\sum _{j=x}^{L-1}{\left[j-{\mu }_{2}\left(x\right)\right]}^{2}\frac{p\left(j\right)}{{w}_{2}\left(x\right)}$$
(10)

The expression for the total variance is presented as

$${S}^{2}={S}_{w}^{2}\left(x\right)+{w}_{1}\left(x\right)\left[1-{w}_{1}\left(x\right)\right]{\left[{\mu }_{1}\left(x\right)-{\mu }_{2}\left(x\right)\right]}^{2}$$
(11)
$${S}^{2}={S}_{w}^{2}\left(x\right)+{S}_{b}^{2}\left(x\right)$$
(12)
\({S}_{w}^{2}\left(x\right)\): variance within the classes

\({S}_{b}^{2}\left(x\right)\): variance between the classes

The total variance is constant and does not depend on the threshold ‘x’. Changing ‘x’ only moves contributions back and forth between the two terms. According to Otsu’s method, minimizing the intra-class variance is therefore the same as maximizing the inter-class variance. The important point is that the quantities in \({S}_{b}^{2}\left(x\right)\) can be calculated recursively as ‘x’ runs through its range. The steps are as follows.

  • Calculate the histogram and the probability of each intensity level. The threshold is the level that maximizes the inter-class variance of the MRI image [24, 55].

  • Set up the initial values of \({w}_{j}\left(x\right)\) and \({\mu }_{j}\left(x\right)\).

  • Step through all possible thresholds x = 1, 2, … up to the maximum intensity. For 8-bit MRI images the values lie in the range [0, 255], and the levels are defined in this range.

  • Update the values of \({w}_{j}\left(x\right)\) and \({\mu }_{j}\left(x\right)\).

  • Calculate the value of \({S}_{b}^{2}\left(x\right)\).

  • The desired threshold is the one corresponding to the maximum calculated value of \({S}_{b}^{2}\left(x\right)\).
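
The steps above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors’ implementation: the function name `otsu_threshold` is hypothetical, the image is assumed to hold integer levels in [0, L), and the between-class variance is taken as w1·w2·(μ1 − μ2)², which follows from the class probabilities and means in Eqs. (2) to (5).

```python
import numpy as np


def otsu_threshold(image, levels=256):
    """Return the threshold x that maximizes the between-class variance S_b^2(x).

    Assumption: `image` holds non-negative integers smaller than `levels`.
    Pixels with value >= x belong to class 2, as in Eqs. (3) and (5).
    """
    hist = np.bincount(np.asarray(image).ravel(), minlength=levels).astype(float)
    p = hist / hist.sum()                       # p(j), probability of level j
    j = np.arange(levels)

    w1 = np.cumsum(p) - p                       # Eq. (2): sum of p(j) for j < x
    w2 = np.cumsum(p[::-1])[::-1]               # Eq. (3): sum of p(j) for j >= x
    m1 = np.cumsum(j * p) - j * p               # sum of j*p(j) for j < x
    m2 = np.cumsum((j * p)[::-1])[::-1]         # sum of j*p(j) for j >= x

    valid = (w1 > 0) & (w2 > 0)                 # both classes must be non-empty
    mu1 = np.zeros(levels)
    mu2 = np.zeros(levels)
    mu1[valid] = m1[valid] / w1[valid]          # Eq. (4)
    mu2[valid] = m2[valid] / w2[valid]          # Eq. (5)

    sb2 = w1 * w2 * (mu1 - mu2) ** 2            # between-class variance per threshold
    return int(np.argmax(sb2))


# Toy bimodal image: left half dark (50), right half bright (200).
image = np.zeros((8, 8), dtype=np.uint8)
image[:, :4] = 50
image[:, 4:] = 200
threshold = otsu_threshold(image)
mask = image >= threshold                       # True for the bright class
```

Cumulative sums replace the explicit loop over x, so every candidate threshold is scored in one pass over the histogram.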

3.2 Watershed algorithm

The watershed algorithm [27, 37] is a transformation defined on a grayscale image. The name refers metaphorically to a geological watershed, or drainage divide, which separates adjacent drainage basins. The algorithm treats the image as a topographic map in which the brightness of each pixel denotes its height, and it finds the lines that run along the tops of the ridges.

A watershed is the ridge that separates areas drained by different river systems. A catchment basin is the geographical region that drains into a river or reservoir. The algorithm is applicable to analyzing biological tissues, studying galaxies, and investigating new processes in semiconductor technology. Computer analysis is used to determine which pixels belong to each object and to make decisions on that basis. The process amounts to the separation of objects from the background. These objects can be anything, such as DNA microarray elements, printed pages, blood cells, brain tissues, or semiconductor dots. The watershed is obtained using the distance transform, in which the distance from each pixel to its nearest non-zero-valued pixel is calculated. Applied directly, the computed image yields a single catchment basin spanning the complete image, whereas one catchment basin per object is required. It is therefore necessary to negate the distance transform so that the two bright regions become catchment basins. The watershed function returns a label matrix that contains a positive integer for each catchment basin, identifying the object to which each pixel belongs. Zero-valued elements mark the watershed lines, which are used to separate the objects in the original image.
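
The distance-transform step described above can be illustrated as follows. This sketch covers only the preparation of the topographic surface, not the flooding that produces the label matrix, and it assumes a brute-force Euclidean distance on a small binary image; the function name `distance_to_nearest_nonzero` and the two-disc test image are hypothetical.

```python
import numpy as np


def distance_to_nearest_nonzero(binary):
    """Brute-force Euclidean distance from every pixel to its nearest non-zero pixel.

    Suitable for small images only: memory grows with (pixels x non-zero pixels).
    """
    targets = np.argwhere(binary)                           # (m, 2) non-zero coordinates
    rows, cols = np.indices(binary.shape)
    pixels = np.stack([rows.ravel(), cols.ravel()], axis=1) # (n, 2) all coordinates
    diff = pixels[:, None, :] - targets[None, :, :]         # (n, m, 2) offsets
    dist = np.sqrt((diff ** 2).sum(axis=2)).min(axis=1)     # nearest target per pixel
    return dist.reshape(binary.shape)


# Two overlapping discs stand in for two touching objects.
yy, xx = np.mgrid[0:20, 0:30]
blobs = ((yy - 10) ** 2 + (xx - 10) ** 2 <= 36) | ((yy - 10) ** 2 + (xx - 19) ** 2 <= 36)

# Distance of each object pixel to the background, negated so that the
# object centers become the bottoms of two catchment basins.
depth = -distance_to_nearest_nonzero(~blobs)
```

In `depth`, the two disc centers are deeper than the neck that joins the discs, so a watershed transform applied to this surface would place its dividing line across the neck.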

3.3 Level set method

The level set method is a conceptual framework that uses level sets for the numerical analysis of shapes and surfaces. It is based on non-parametric model estimation with minimum contour energy to determine minimum distance. The technique relies on two central embeddings: first, the interface is embedded as the zero level set of a function; second, that level set function is defined in a space one dimension higher than the interface. The surface evolves according to the level set equation, which is solved iteratively by updating the values of ‘φ’ at each time step. The equation of the level set method [7] is given as

$$\frac{\partial\mathrm\varphi}{\partial\text{t}}=\text{v}.\left|\nabla\mathrm\varphi\right|$$
(13)

\(\mathrm\varphi\) denotes the level set function, and \(\nabla\mathrm\varphi\) denotes its gradient. The contour [57, 59] deforms at speed ‘v’, which depends on the contour curvature and image gradient features. ‘v’ is the velocity term and governs the evolution of the level set in the different regions of the image. For image segmentation, ‘v’ depends on the image data and the related curvature function. The equation is given in the form [7]

$$\frac{\partial\mathrm\varphi}{\partial\mathrm t}=\left|\nabla\mathrm\varphi\right|\left[\mathrm\alpha\,\mathrm X\left(\mathrm I\right)+\left(1-\mathrm\alpha\right)\nabla\cdot\frac{\nabla\mathrm\varphi}{\left|\nabla\mathrm\varphi\right|}\right]$$
(14)

Here, X(I) is the data function that drives the contour toward the target, α weights the data term against the curvature term, and the curvature term \(\nabla\cdot\frac{\nabla\mathrm\varphi}{\left|\nabla\mathrm\varphi\right|}\) keeps the level set smooth as it is updated.
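
The iterative update of ‘φ’ in Eq. (13) can be sketched with an explicit Euler step. This is a simplified illustration, assuming a constant speed ‘v’, central differences for the gradient, and a signed-distance initialization that is negative inside the contour; practical solvers typically add upwind differencing and periodic reinitialization. The function name `evolve_level_set` is hypothetical.

```python
import numpy as np


def evolve_level_set(phi, speed, dt=0.5, steps=10):
    """Apply `steps` explicit Euler updates of d(phi)/dt = v * |grad(phi)| (Eq. 13)."""
    phi = np.asarray(phi, dtype=float).copy()
    for _ in range(steps):
        gy, gx = np.gradient(phi)                  # central differences along rows, columns
        phi += dt * speed * np.sqrt(gx ** 2 + gy ** 2)
    return phi


# Initial contour: a circle of radius 20, stored as a signed distance function
# (negative inside the contour, positive outside).
yy, xx = np.mgrid[0:64, 0:64]
phi0 = np.sqrt((yy - 32.0) ** 2 + (xx - 32.0) ** 2) - 20.0

# With this sign convention a positive speed raises phi everywhere,
# so the zero level set (the contour) moves inward.
phi1 = evolve_level_set(phi0, speed=1.0)
area_before = int((phi0 < 0).sum())
area_after = int((phi1 < 0).sum())
```

For segmentation, the constant `speed` would be replaced by a per-pixel array computed from the image data and the curvature.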

3.4 K -Means clustering

The K-means clustering algorithm [34, 55] is an unsupervised algorithm used to segment the regions of interest from the background of an image. The given image is partitioned into clusters, and each cluster has its own centroid [50]. The algorithm then minimizes the sum of the squared distances between each point and the center of its cluster. For a given set of observations (x1, x2, …, xn), the objective function [5] is presented as

$$\mathrm Q=\sum\limits_{\mathrm a=1}^{\mathrm k}\sum\limits_{\mathrm b=1}^{\mathrm n}\left\|\mathrm x_{\mathrm b}^{(\mathrm a)}-{\mathrm C}_{\mathrm a}\right\|^2$$
(15)

Here,

\({\mathrm C}_{\mathrm a}\): centroid of cluster ‘a’

\({\text{x}}_{\text{b}}^{\left(\text{a}\right)}\): case ‘b’ assigned to cluster ‘a’

n: number of cases

k: number of clusters

Q: objective function

\(\left\|\mathrm x_{\mathrm b}^{(\mathrm a)}-{\mathrm C}_{\mathrm a}\right\|^2\): squared distance between a case and the centroid of its cluster
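
A minimal sketch of K-means on pixel intensities shows how the objective Q in Eq. (15) is evaluated. It assumes one-dimensional intensity features, evenly spaced initial centroids (chosen so the example is deterministic), and a fixed number of iterations; the function name `kmeans_intensity` is hypothetical.

```python
import numpy as np


def kmeans_intensity(pixels, k, iterations=20):
    """Cluster pixel intensities into k groups; return (labels, centroids, Q)."""
    pixels = np.asarray(pixels, dtype=float).ravel()
    # Assumption: start from centroids spread evenly over the intensity range.
    centroids = np.linspace(pixels.min(), pixels.max(), k)
    for _ in range(iterations):
        # Assignment step: each pixel joins the cluster with the nearest centroid.
        labels = np.argmin(np.abs(pixels[:, None] - centroids[None, :]), axis=1)
        # Update step: each centroid becomes the mean of its assigned pixels.
        for a in range(k):
            if np.any(labels == a):
                centroids[a] = pixels[labels == a].mean()
    labels = np.argmin(np.abs(pixels[:, None] - centroids[None, :]), axis=1)
    q = float(((pixels - centroids[labels]) ** 2).sum())   # objective Q of Eq. (15)
    return labels, centroids, q


# Toy intensities: dark background, bright region, and mid-gray tissue.
intensities = [10, 12, 11, 200, 205, 198, 90, 95]
labels, centroids, q = kmeans_intensity(intensities, k=3)
```

For an image, `labels` would be reshaped to the image dimensions, and the cluster with the brightest centroid could be taken as the candidate tumor region.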

3.5 Discrete wavelet transform (DWT)

The DWT [51] used here is based on the 2D Haar decomposition, in which the image is filtered row-wise and column-wise. The multi-resolution decomposition splits a 2D image into a low-frequency ‘L’ sub-band and a high-frequency ‘H’ sub-band along each direction, so the image is divided into four sub-bands, LL, LH, HL, and HH, which together contain extensive information about the image [28]. The sub-bands, obtained by passing the original image through a combination of low-pass and high-pass filters, are shown in detail in Fig. 1a and b. Figure 2 displays a Haar DWT example in which a 64 × 64 image is separated into four bands: LL (32 × 32), LH (32 × 32), HL (32 × 32), and HH (32 × 32). The LL band is further divided into four bands: LLLL (16 × 16), LLLH (16 × 16), LLHL (16 × 16), and LLHH (16 × 16). To refine a specific region of the decomposed image, morphological operations such as erosion, dilation, opening, and closing can be employed, and the coefficients of the wavelet transform can be obtained using image arithmetic operations.
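
The sub-band sizes quoted above can be reproduced with a short sketch of one level of the 2D Haar transform. It assumes the orthonormal Haar filters (pairwise sums and differences scaled by 1/√2) and an image with even dimensions; the assignment of the names LH and HL to the two mixed bands varies between references, and the function name `haar_dwt2` is hypothetical.

```python
import numpy as np


def haar_dwt2(image):
    """One level of the 2D Haar DWT; returns the LL, LH, HL and HH sub-bands."""
    x = np.asarray(image, dtype=float)
    s = np.sqrt(2.0)
    # Row-wise step: low-pass and high-pass along each row, keeping every second column.
    low = (x[:, 0::2] + x[:, 1::2]) / s
    high = (x[:, 0::2] - x[:, 1::2]) / s
    # Column-wise step: the same filters along each column, keeping every second row.
    ll = (low[0::2, :] + low[1::2, :]) / s
    lh = (low[0::2, :] - low[1::2, :]) / s
    hl = (high[0::2, :] + high[1::2, :]) / s
    hh = (high[0::2, :] - high[1::2, :]) / s
    return ll, lh, hl, hh


# A 64 x 64 test image gives four 32 x 32 sub-bands.
test_image = np.arange(64 * 64, dtype=float).reshape(64, 64)
ll, lh, hl, hh = haar_dwt2(test_image)

# Decomposing LL again gives the 16 x 16 second-level bands.
llll, lllh, llhl, llhh = haar_dwt2(ll)
```

Because the filters are orthonormal, the four sub-bands together carry the same energy (sum of squares) as the input image.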

Fig. 1 a Sub-bands in DWT processing [12]. b Filter banks in DWT processing [33]

Fig. 2 Example of Haar DWT [12]

The DWT decomposes the input brain MRI images into a collection of sub-images with varying resolutions and frequency regions. The input image is subdivided into multiple frequency bands by sub-band coding with the help of a filter bank. A filter bank is a set of filters that share either a common input or a common output; filters that share a common input form an analysis bank, while filters that share a common output form a synthesis bank. The filter bank converts the image into the frequency domain. The first step toward perfect reconstruction is to pass the input image through the low-pass filter bank. The filter bank is divided into two portions, the analysis section and the synthesis section. The analysis section decomposes the input image into a series of sub-band components, and the synthesis section rebuilds the original signal from these components. The sub-band analysis and synthesis filters are designed so that the filter behavior is alias-free and satisfies the perfect reconstruction property. Eliminating aliasing, phase distortion, and amplitude distortion yields a perfect reconstruction from these filter banks, making them well suited to sub-band coding and multi-resolution image decomposition. During operation, the filter bank separates the MRI image into two equal frequency bands. The filter bank in the DWT consists of a low-pass filter H0[z] and a high-pass filter H1[z]. The signal outputs after filtering at levels 1 and 2 are given below. At the low-pass filter, level (1) processing,

$${\text{F}}_{0}\left[z\right]= \text{F}\left[z\right]{\text{H}}_{0}\left[z\right]$$
(16)

At the high-pass filter, level (2) processing,

$${\text{F}}_{1}\left[z\right]= \text{F}\left[z\right]{\text{H}}_{1}\left[z\right]$$
(17)

After this filtering procedure, the sampling rate of each filtered signal is higher than necessary [33]. As a result, down-sampling discards half of the samples. The Z-transforms of the down-sampled signals then follow from Eqs. (16) and (17). At level (3) processing, the output is given as

$$\mathrm Y\left[z\right]=\frac12\left\{\mathrm F\left[z^{1/2}\right]\cdot{\mathrm H}_0\left[z^{1/2}\right]+\mathrm F\left[-z^{1/2}\right]\cdot{\mathrm H}_0\left[-z^{1/2}\right]\right\}$$
(18)

At level (4) processing, the output is given as

$$\mathrm Y\left[z\right]=\frac12\left\{\mathrm F\left[z^{1/2}\right]\cdot{\mathrm H}_1\left[z^{1/2}\right]+\mathrm F\left[-z^{1/2}\right]\cdot{\mathrm H}_1\left[-z^{1/2}\right]\right\}$$
(19)

Based on the two filtered and decimated signals, the synthesis filter bank reconstructs the signal. Expansion, or interpolation, is the synthesis step that up-samples the signal in each branch by a factor of two. This interpolation is achieved by inserting zeros between consecutive samples. After interpolation, the Z-transforms of the signal at level 5 and level 6 are given in Eqs. (20) and (21), respectively. At level (5) processing, the output is given as

$$\text{X}\left[z\right]=\frac12\left\{\text{F}\left[z\right]\cdot{\mathrm H}_0\left[z\right]+\text{F}\left[-z\right]\cdot{\mathrm H}_0\left[-z\right]\right\}$$
(20)

At level (6) processing, the output is given as

$$\text{X}\left[z\right]=\frac12\left\{\text{F}\left[z\right]\cdot{\mathrm H}_1\left[z\right]+\text{F}\left[-z\right]\cdot{\mathrm H}_1\left[-z\right]\right\}$$
(21)
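
The analysis and synthesis stages can be illustrated in one dimension with the Haar filter pair. The sketch below is written in polyphase form, an assumption made for brevity: filtering and down-sampling are merged into one step on sample pairs, and up-sampling and synthesis filtering are merged likewise, which is equivalent to the filter-then-decimate and zero-insert-then-filter structure for these two-tap filters. The function names `analysis` and `synthesis` are hypothetical.

```python
import numpy as np


def analysis(signal):
    """Split a signal of even length into low-pass and high-pass halves."""
    x = np.asarray(signal, dtype=float)
    s = np.sqrt(2.0)
    low = (x[0::2] + x[1::2]) / s     # low-pass filter followed by down-sampling by 2
    high = (x[0::2] - x[1::2]) / s    # high-pass filter followed by down-sampling by 2
    return low, high


def synthesis(low, high):
    """Rebuild the signal from its two half-rate branches."""
    s = np.sqrt(2.0)
    out = np.empty(2 * len(low))
    out[0::2] = (low + high) / s      # even samples
    out[1::2] = (low - high) / s      # odd samples
    return out


row = np.array([4.0, 6.0, 10.0, 12.0, 8.0, 8.0, 0.0, 2.0])  # one toy image row
low, high = analysis(row)
rebuilt = synthesis(low, high)
```

Each branch holds half as many samples as the input, and combining the two branches returns the original row, which is the perfect reconstruction property described above.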

The wavelet is used to extract coefficients from the brain MRI and to localize the frequency details of the signal for classification. The MRI image is decomposed into spatial low-frequency components, which are taken from the LL sub-band. The HL frequency components perform better than those of the LL band. The high-frequency components LH, HL, and HH represent the horizontal, vertical, and diagonal details at the first and second levels of decomposition. The LL and HL bands are combined for MRI image analysis. Quantitative information such as color features, contrast, and shape is extracted from the image. The DWT gathers statistical features from the brain tumor image and feeds them into probabilistic neural network classifiers [71] as inputs for training and testing, and for classifying the image as normal or abnormal.

3.6 Convolutional neural network (CNN)

The convolutional neural network [31, 32] is a type of deep neural network in which information passes in one direction through the model, with no feedback connections. It is a feed-forward artificial neural network that processes its input through successive layers [30]. CNNs have been widely used by researchers for different applications in computer vision. The CNN model architecture used for brain MRI images is shown in Fig. 3. The architecture consists of three stages: two sequential convolutional layers, each with a pooling layer, and a third, fully connected layer for classification. The convolution layer performs the 3D convolution operation, which maps the features of the image directly with the convolution kernel [62, 78]. The next step is the pooling layer, which performs subsampling by applying a pooling function over a spatial window, with the pooling kernel applied to each output feature map [81]. The pooling layer output is coupled to the third layer, which gives the feature vector. The original MRI images (64 × 64 × 1) are processed by two convolution kernels (32 × 32 × 1) and (16 × 16 × 1), each combined with a (2 × 2 × 1) pooling kernel for subsampling. Table 1 presents the sizes of the MRI images as they are processed through the layered architecture of the CNN.

Fig. 3 CNN architecture

Table 1 Image size with different layers in CNN

The MRI images are processed using the LeNet architecture, which consists of convolutional and pooling layers followed by a fully connected layer with Softmax activation. In deep learning on MRI images, LeNet CNNs place a subsampling layer called max pooling between convolutional layers to reduce the spatial size of the images, which helps prevent overfitting and allows the CNN to train more successfully.
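
The effect of convolution followed by 2 × 2 max pooling on the image size can be sketched as below. This is a minimal illustration, assuming zero-padded ("same") convolution so that only the pooling step changes the spatial size, a single channel, and a hypothetical 3 × 3 kernel with fixed weights. A trained CNN learns its kernel weights, and the reading of 64 → 32 → 16 as feature-map sizes is an assumption.

```python
def conv2d_same(image, kernel):
    """Slide the kernel over a zero-padded image (cross-correlation, as in most CNNs)."""
    height, width = len(image), len(image[0])
    k_height, k_width = len(kernel), len(kernel[0])
    pad_h, pad_w = k_height // 2, k_width // 2
    output = [[0.0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            total = 0.0
            for i in range(k_height):
                for j in range(k_width):
                    rr, cc = r + i - pad_h, c + j - pad_w
                    # Pixels outside the image count as zero.
                    if 0 <= rr < height and 0 <= cc < width:
                        total += image[rr][cc] * kernel[i][j]
            output[r][c] = total
    return output


def max_pool_2x2(feature_map):
    """Keep the maximum of each non-overlapping 2 x 2 window."""
    return [
        [
            max(feature_map[r][c], feature_map[r][c + 1],
                feature_map[r + 1][c], feature_map[r + 1][c + 1])
            for c in range(0, len(feature_map[0]) - 1, 2)
        ]
        for r in range(0, len(feature_map) - 1, 2)
    ]


# Hypothetical 64 x 64 single-channel input and illustrative kernel weights.
image = [[float((r * 64 + c) % 7) for c in range(64)] for r in range(64)]
kernel = [[0.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 0.0]]

stage1 = max_pool_2x2(conv2d_same(image, kernel))   # 64 x 64 -> 32 x 32
stage2 = max_pool_2x2(conv2d_same(stage1, kernel))  # 32 x 32 -> 16 x 16
```

Each pooling step halves both spatial dimensions, so two convolution and pooling stages reduce a 64 × 64 input to 16 × 16 before the fully connected layer.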

4 Methodology

The methodology of brain tumor detection is depicted in Fig. 4. The steps involved are brain MRI input, pre-processing, image segmentation, and analysis. MRI is the imaging method used to obtain visual information about the brain. It is widely used for monitoring, surgery, and diagnosis. MRI uses radio waves and a magnetic field to acquire detailed information about the brain tissues and brain stem. MRI differs from a CT scan in that it does not use ionizing radiation. The first step is the collection and acquisition of images from the BRATS data set (https://www.med.upenn.edu/sbia/brats2018/data.html). Pre-processing is an essential step applied before segmentation; it includes the operations required to make the diagnosis more evident from the input MRI images. In the pre-processing stage, the visual quality of the images is improved and filtering is carried out to remove noise. Such noise makes it harder to distinguish abnormal from normal tissue and therefore makes exact interpretation more challenging. Segmentation divides an image into multiple segments and separates the suspicious region from the pre-processed MRI image, producing a simpler image that is more meaningful and easier to examine. In this process, different segmentation methods are used: watershed transformation, thresholding, the level set method, K-means clustering, Otsu’s method, DWT, and CNN. Thresholding is a segmentation technique used to create binary images. It compares each pixel of the image against a fixed constant value. If a pixel value is less than the threshold value, the pixel is replaced with a black pixel. If a pixel value is greater than the threshold value, the pixel is replaced with a white pixel.

$$\mathrm G(\mathrm m,\mathrm n)=\begin{cases}1, & \mathrm F(\mathrm m,\mathrm n)>\mathrm{Threshold\;value}\\ 0, & \mathrm F(\mathrm m,\mathrm n)<\mathrm{Threshold\;value}\end{cases}$$

where G(m, n) is the output image and F(m, n) is the input image.

Fig. 4 Methodology for tumor detection
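
The thresholding rule for G(m, n) can be written as a short sketch. It assumes a grayscale image stored as nested lists and a hypothetical function name. Because the rule does not say what happens when a pixel equals the threshold, this sketch maps that case to 0 (black), which is an assumption.

```python
def threshold_image(image, threshold):
    """Return a binary image: 1 where the pixel exceeds the threshold, else 0."""
    # Pixels equal to the threshold are mapped to 0 (assumed; not defined in the rule).
    return [[1 if pixel > threshold else 0 for pixel in row] for row in image]


# Hypothetical 2 x 2 grayscale patch and threshold value.
binary = threshold_image([[10, 200], [128, 129]], threshold=128)
```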

The MRI BRATS data is divided into two sections: training and testing. The LeNet CNN architecture is loaded and compiled. The network is then trained, and the serialized network weights can be saved to disk and reused later without having to retrain the network. The output extracted from the fully connected layer shows that the implementation is working properly. The CNN processes the MRI images through convolutional layers and pooling operations that are typically structured so that the spatial resolution of the representations decreases as the number of channels increases. LeNet CNNs process the representations produced by the convolutional blocks using one or more fully connected layers before producing the output. In the DWT, the MRI images are processed by low-pass and high-pass filter banks using the LL, LH, HL, and HH combinations. To reduce complexity and increase performance, DWT-based region-growing segmentation of the brain tumor is used. A probabilistic neural network classifier is trained and tested to obtain the accuracy and other performance parameters for detecting the tumor site in brain MRI images.

Image segmentation focuses on regions of interest [13] and supports all subsequent modules of the analysis. Although segmentation is a critical stage, it is not always necessary when direct classification takes precedence over other goals. Feature extraction follows image segmentation. Image features are extracted from the output of the initial steps to measure and derive non-redundant features. These features give information about the image dimensions and summarize the full data. The meaning of the tissues and regions is extracted through image analysis, and morphological operations are used for the same purpose. This image analysis process makes the actual size and shape of the tumor known to medical practitioners. Finally, the generated features with the smallest dimension are used to train a classifier to distinguish between normal and diseased brain tissues. After training, the system performance is assessed using performance metrics.
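
One common morphological operation is opening, which is an erosion followed by a dilation and removes small isolated specks from a binary mask while keeping larger regions. The sketch below is a minimal illustration that assumes a binary mask stored as nested lists of 0 and 1 and a 3 × 3 square structuring element; the function names are hypothetical and this is not the toolbox implementation.

```python
OFFSETS = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)]  # 3 x 3 square (assumed)


def erode(mask):
    """Keep a pixel only if its whole 3 x 3 neighbourhood is foreground."""
    height, width = len(mask), len(mask[0])
    return [
        [
            int(all(
                0 <= r + dr < height and 0 <= c + dc < width and mask[r + dr][c + dc]
                for dr, dc in OFFSETS
            ))
            for c in range(width)
        ]
        for r in range(height)
    ]


def dilate(mask):
    """Set a pixel if any pixel in its 3 x 3 neighbourhood is foreground."""
    height, width = len(mask), len(mask[0])
    return [
        [
            int(any(
                0 <= r + dr < height and 0 <= c + dc < width and mask[r + dr][c + dc]
                for dr, dc in OFFSETS
            ))
            for c in range(width)
        ]
        for r in range(height)
    ]


def opening(mask):
    """Morphological opening: erosion followed by dilation."""
    return dilate(erode(mask))


# Hypothetical mask: a 3 x 3 region plus one isolated speck in the corner.
mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 1],
]
cleaned = opening(mask)
```

In this example the 3 × 3 region survives the opening unchanged, while the single-pixel speck is removed.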

5 Results & discussions

The level set method relies on iterations to obtain the tumor results. The watershed algorithm is a pre-segmented and faster method. The simulation is carried out in MATLAB 18.0 using the Image Processing Toolbox. We developed our own MATLAB code for all the algorithms. The simulation is performed using the 2D Haar DWT, K-means, Otsu’s, watershed, level set, and CNN methods on 10 brain tumor images available online in the BRATS 2018 dataset. Table 2 presents the MATLAB response time of the different methods for brain tumor segmentation. The comparative graph of all the techniques, obtained from the MATLAB simulation, is shown in Fig. 5.

Table 2 MATLAB response time of different methods for brain tumor segmentation

Fig. 5 Response time comparison graph

The response time is important for modeling the system on any specific hardware. The response times of the CNN, DWT, K-means, level-set, watershed and Otsu’s methods are 2.519, 2.675, 4.571, 7.290, 9.219, and 12.500 s, respectively. The significance of the MATLAB simulation time is that it gives an estimate of the delay to expect when the algorithm is realized in the MATLAB-HDL simulator or Simulink for tumor detection system implementation. The MATLAB simulation indicates the time complexity of each algorithm and helps the designer choose the machine learning model [26, 40]. The performance of the algorithms is evaluated using precision, recall, F-measure, and accuracy [43], computed with the standard equations. Table 3 lists the computed values of all the algorithms for each measure. A confusion matrix is formed that shows how well the trained model predicted each target class relative to the experimental confusion matrix counts.

$$\begin{array}{*{20}c}Recall=\frac{TP}{TP+FN}\\ Precision=\frac{TP}{TP+FP}\\ Fmeasure=\frac{2\times Precision\times Recall}{Precision+Recall}\\ Accuracy=\frac{TP+TN}{TP+TN+FP+FN}\end{array}$$

Where,

TP: True Positives: The target is positive, and the model predicts it to be positive.

TN: True Negatives: The target is negative, and the model predicts it to be negative.

FP: False Positives: The target is negative, yet the model predicts a positive outcome.

FN: False Negatives: The target is positive, yet the model predicts a negative outcome.
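
The four performance measures follow directly from the confusion matrix counts. The sketch below is a minimal illustration with a hypothetical function name and hypothetical counts; the counts are not taken from the experiments reported here.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute recall, precision, F-measure, and accuracy from confusion counts."""
    # Assumes at least one positive target and one positive prediction,
    # so that none of the denominators is zero.
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    f_measure = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return {
        "recall": recall,
        "precision": precision,
        "f_measure": f_measure,
        "accuracy": accuracy,
    }


# Hypothetical counts for illustration only.
metrics = classification_metrics(tp=40, tn=45, fp=5, fn=10)
```

Because the F-measure is the harmonic mean of precision and recall, it always lies between the two values and is pulled toward the smaller one.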

Table 3 Comparative performance measures

Figure 6 presents the simulation results of the watershed algorithm and the level set method, including the intermediate steps with morphological operations and related operators. Figure 7 shows the results obtained using CNN and the other methods. For the watershed algorithm, the original tumor image is converted from RGB to grayscale, and histogram equalization is applied to adjust the contrast of the image by changing the intensity values [33]. Next, a Gaussian filter is used to remove noise, and the watershed algorithm is applied. Mathematical operators and morphological operations such as opening, opening by reconstruction, opening-closing, open-close by reconstruction, regional maxima, and superimposed maxima are then used to determine the background markers and threshold values. After processing through the modified maximum values, the tumor is detected. The level set algorithm is applied in the same way: the original tumor image is converted from RGB to grayscale and histogram equalization is applied to adjust the contrast, then the resulting image is processed by the contour operation for the minimum and maximum contour values. Morphological operations are then applied to obtain the detected tumor.
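
The histogram equalization step used in both pipelines can be sketched as follows. This is a minimal illustration of the standard cumulative-histogram mapping, assuming an 8-bit grayscale image stored as nested lists of integers; the function name is hypothetical, and toolbox implementations may differ in their details.

```python
def equalize_histogram(image, levels=256):
    """Spread pixel intensities over the full range using the cumulative histogram."""
    pixels = [pixel for row in image for pixel in row]

    histogram = [0] * levels
    for pixel in pixels:
        histogram[pixel] += 1

    # Cumulative distribution: number of pixels at or below each level.
    cdf, running = [], 0
    for count in histogram:
        running += count
        cdf.append(running)

    cdf_min = min(value for value in cdf if value > 0)
    total = len(pixels)
    if total == cdf_min:
        # A constant image has nothing to stretch.
        return [list(row) for row in image]

    lookup = [
        max(0, round((value - cdf_min) * (levels - 1) / (total - cdf_min)))
        for value in cdf
    ]
    return [[lookup[pixel] for pixel in row] for row in image]


# Hypothetical low-contrast patch: values 100..103 are stretched to 0..255.
equalized = equalize_histogram([[100, 101], [102, 103]])
```

The mapping preserves the ordering of intensities while increasing the separation between them, which is what improves the visible contrast.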

Fig. 6 Brain tumor segmentation using the watershed algorithm and level set methods

Fig. 7 Brain tumor segmentation using CNN and other algorithms

6 Conclusion

In the medical field, MRI image segmentation is tedious and time-consuming work. This paper addressed brain MRI tumor detection and image segmentation using Otsu’s method, level set, watershed, K-means, DWT, and CNN. Tumor detection with all the methods was carried out successfully in the MATLAB-2018 simulation environment. The response times for Otsu’s method, level set, watershed, K-means, DWT, and CNN are 12.50 s, 9.219 s, 7.280 s, 4.571 s, 2.675 s, and 2.519 s, respectively. Otsu’s method consumes more time than all the other techniques because it processes the image in one pass and spends computation on unnecessary regions. On the other hand, the image is pre-segmented in the case of the level set method. K-means divides the image into clusters and works on average values. The watershed algorithm relies on morphological operators, region filtering, and the exact position of the outline. The DWT method divides the image into low- and high-frequency bands with four sub-bands, LL, LH, HL, and HH, which makes it easy to decompose a large image. The CNN processes the image using a layered architecture. The algorithms are analyzed using precision, recall, F-measure, and accuracy as performance measures.

  • The computed values of performance measures: recall, precision, F-measure, and accuracy of Otsu’s method are 0.681, 0.892, 0.773, and 0.714 respectively.

  • The computed values of performance measures: recall, precision, F-measure, and accuracy of the watershed algorithm are 0.782, 0.827, 0.804, and 0.782 respectively.

  • The computed values of performance measures: recall, precision, F-measure, and accuracy of the level set method are 0.775, 0.863, 0.817, and 0.804 respectively.

  • The computed values of performance measures: recall, precision, F-measure, and accuracy of the K-means algorithm are 0.931, 0.764, 0.839, and 0.843 respectively.

  • The computed values of performance measures: recall, precision, F-measure, and accuracy of the DWT algorithm are 0.885, 0.867, 0.876, and 0.869 respectively.

  • The computed values of performance measures: recall, precision, F-measure, and accuracy of the CNN are 0.869, 0.952, 0.909, and 0.913 respectively.

In comparison to the other algorithms for the brain tumor application, CNN delivered the highest precision, F-measure, and accuracy and the shortest response time, while K-means achieved the highest recall. We intend to use CNN with more layers in the future, as well as construct a machine-learning model to offer the same functionality.