Introduction

Since entering the era of big data, the field of artificial intelligence has developed significantly, thanks to the accumulation of data and the improvement of computer performance. Among these advancements, deep learning, as the hottest technology in AI, has achieved great success in the fields of visual recognition, target detection, image fusion, and natural language processing. However, while deep learning-based models can quickly and accurately recognize, segment, and fuse images, they necessitate a large amount of experimental data for training [1]. Meanwhile, humans can swiftly and accurately identify new categories based on a single image alone, a task challenging for all types of deep learning models [2]. To further enhance model performance, the number of network layers in various types of deep learning models has been increasing. However, this escalation in complexity has led to increased training time, human and material costs [3]. To solve this problem, the study of using a small number of samples or even zero samples to train neural networks has attracted extensive attention from researchers [4,5,6,7]. Particularly in the field of medical image recognition, obtaining large datasets similar to ImageNet is challenging due to the private and expensive nature of medical data. Additionally, medical images typically require professional medical personnel for annotation to ensure accuracy and validity, significantly raising the cost of medical image annotation [8]. Therefore, small-sample learning holds broader development potential in the field of medical images.

The brain, one of the most complex organs in humans, has long been afflicted by brain tumor disease. Due to their potential occurrence in any tissue region of the brain and their inherent heterogeneity and low survival rate, brain tumors are considered one of the most deadly cancers [9]. In recent years, the number of brain tumor-related deaths in developed countries has nearly tripled [10]. According to the CBTRUS statistical report study in the United States, it was found that 81,246 people died from brain tumors during 2013–2017 alone [11]. Depending on the type of tumor, physicians can adopt treatment options such as surgery, chemotherapy, and radiation therapy, and appropriate treatment options can greatly increase the chances of survival of brain tumor patients, but these are dependent on accurate identification of the type of brain tumor [12]. Among available diagnostic tools, MRI (Magnetic Resonance Imaging) images are commonly utilized for detecting and classifying brain tumors due to their high resolution and accuracy in brain tissue [13, 14]. However, the detection and classification of brain tumors based on MRI images represent a challenging task that heavily relies on the experience of medical professionals. This process is time-consuming, and misdiagnosis of brain tumors can lead to serious complications, reducing the chances of patient survival [15]. Therefore, the introduction of computer-aided techniques, such as deep learning, to alleviate the challenges of brain tumor diagnosis holds significant practical importance for improving the efficiency of brain tumor diagnosis.

In this context, various deep learning models have been widely employed for the classification and grading of brain tumors to ease the diagnostic process and improve its efficiency. Many existing works focus on brain tumor diagnosis using deep learning models such as CNN [16, 17], DCNN (deep convolutional neural network) [18], CNN–DWT (discrete wavelet transform)–LSTM [19], 3D-CNN (three-dimensional convolutional neural network) [20], SURF (speeded up robust feature)-SVM [21], GoogleNet [22], and ResNet-50 [23,24,25]. However, most of these studies experimented on only a single MRI dataset, so the generalization performance of their models needs further validation. Moreover, most existing works use CNN or transfer learning models that require a large number of samples for fine-tuning, which considerably increases labor and material costs.

In contrast, this paper proposes GraphMriNet, a few-shot brain tumor classification model based on the Prewitt operator and the graph isomorphic network. Trained in a few-shot setting and tested for generalization on multiple MRI datasets, the GraphMriNet model exhibits better prediction performance for different brain tumor types. This is crucial for reducing the waste of medical resources, alleviating the workload of brain tumor diagnosis and classification, and increasing the survival probability of tumor patients. The main contributions of this paper are as follows:

(1) For the first time, a graph isomorphic network is applied to brain tumor detection. A brain tumor diagnostic model is established based on the Prewitt operator and the graph isomorphic network, and it is compared with models proposed in existing research and with mainstream deep learning models in terms of accuracy, sensitivity and specificity; the comparison demonstrates the superiority of the GraphMriNet model.

(2) GraphMriNet has high diagnostic accuracy on few-shot datasets and strong generalization ability. The diagnostic accuracies of GraphMriNet on the four datasets BMIBTD, CE-MRI, BTC-MRI, and FSB were 100%, 100%, 100%, and 99.68%, respectively, all of which reached SOTA (state of the art).

Related work

Among the existing studies, there have been numerous efforts to identify and diagnose brain tumors using various types of deep learning models. The majority of these studies rely on MRI imaging techniques, which serve as inputs to deep learning models for training brain tumor diagnosis models through image processing techniques like image enhancement, edge detection, and segmentation. A summary of the relevant studies and their performance is shown in Table 1.

Table 1 Comprehensive comparison of existing related studies

Traditional machine learning algorithms typically involve several key steps, including data pre-processing, feature extraction, feature selection, dimensionality reduction, and classification. Feature extraction is the cornerstone of these algorithms, and the accuracy of the classification model relies heavily on the quality of the extracted features. Previous studies have often used conventional feature extraction techniques, such as grayscale matrices [26,27,28], directional gradient matrices [29], and genetic algorithms [30, 31], to extract features from MRI images, and then applied classification models such as support vector machines and random forests. In recent years, with the rapid development of artificial intelligence, deep learning models have been widely used for brain tumor diagnosis. Compared to traditional feature extraction algorithms, deep learning models can adapt to the training requirements of large datasets and exhibit excellent feature extraction and classification capabilities. Existing research on deep learning-based brain tumor image classification falls mainly into two categories: adaptive CNN models built on convolutional neural networks, and fine-tuned network models based on transfer learning.

In terms of adaptive CNN models based on convolutional neural networks, most researchers have designed the network structure around the features of the dataset so that it can adapt to complex data training requirements. Abiwinanda et al. [16] designed a 5-layer CNN for brain tumor diagnosis on the CE-MRI dataset and achieved 98.51% classification accuracy. Seetha et al. [17] optimized the loss function and model architecture to achieve 97.5% classification accuracy on the BRATS dataset. Hemanth et al. [18] adapted the DCNN to eliminate the backpropagation training process, which greatly shortened the training time of the model, and achieved 96.4% classification accuracy on a private dataset. Kutlu et al. [19] used a CNN to extract image features and a discrete wavelet transform to process the images, classifying liver and brain tumors with DWT-LSTM and achieving a classification accuracy of 98.6% on the CE-MRI dataset. Mzoughi et al. [20] proposed an adaptive enhanced data preprocessing technique and a fully automated 3D CNN architecture, achieving 96.49% classification accuracy on the Brats-2018 dataset, but the model is complex, needs a long training time, and has poor portability. Bhagat et al. [21] used a CNN for feature extraction from brain tumor images, selected features using the SURF (speeded up robust feature) technique, and finally classified the features with an SVM, achieving 99.24% classification accuracy on the CE-MRI dataset.

In terms of fine-tuned network models based on transfer learning, researchers have mostly used existing deep learning models with parameters pre-trained on the ImageNet dataset for brain tumor classification. Deepak et al. [22] used a pre-trained GoogleNet to extract features from brain tumor images via transfer learning on the CE-MRI dataset and achieved 98% classification accuracy. Inar et al. [23] used various deep learning models to classify brain tumor images; they fine-tuned a pre-trained ResNet-50 model on the BMIBTD dataset and achieved 97.2% classification accuracy. Saxena et al. [24] used transfer learning for brain tumor classification based on the ResNet-50 network and achieved a highest classification accuracy of 95%. Ismael et al. [25] used data augmentation techniques together with a pre-trained ResNet-50 to classify the CE-MRI dataset with 99% classification accuracy, but the model was complex and required at least 500 iterations.

In summary, deep learning models have found extensive applications in the field of brain tumor diagnosis and demonstrate commendable classification performance. However, most existing models rely on transfer learning or adaptive CNN approaches, and graph isomorphic networks have not yet been applied to brain tumor diagnosis. On the one hand, deep learning models based on transfer learning and adaptive CNNs typically need a substantial volume of MRI image data for fine-tuning to achieve high classification accuracy. Because MRI images are costly to acquire, this requirement significantly increases training time and data acquisition expenses and wastes substantial resources. On the other hand, the majority of current studies focus on a single dataset, typically involving only a binary or three-class classification of MRI brain tumor images. This approach overlooks the model's generalization performance across diverse datasets.

In this paper, we propose GraphMriNet, a few-shot brain tumor classification model based on the graph isomorphic network and the Prewitt operator. The model achieved 100%, 100%, 100% and 99.68% classification accuracy on the four publicly available datasets BMIBTD, CE-MRI, BTC-MRI and FSB, respectively, showing high performance and robustness in brain tumor diagnosis.

The remainder of this paper is organized as follows. The section "Proposed methods" presents an overview of the GraphMriNet model and the complete framework of the proposed classification algorithm. The section "Results" describes the datasets used in the study, gives details of the experiments performed, presents the evaluation results and discusses them. The final section concludes the paper.

Proposed methods

The section "Prewitt operator" of this chapter provides an overview of the principle of the Prewitt filtering algorithm, the section "Graph isomorphism network" introduces the GIN model, and the symbols used in this paper and their interpretation are shown in Table 2.

Table 2 Symbol description table

The GraphMriNet model, shown in Fig. 1, mainly consists of the Prewitt operator and the GIN model. First, the original images are filtered by the Prewitt operator to obtain the image edge information. After Prewitt processing, each pixel with a gray intensity value greater than or equal to 128 is treated as a graph node whose feature is the gray intensity of that pixel, and two neighboring nodes are connected by an edge. One graph is formed per image; because only the high-intensity edge pixels are kept, the Prewitt-processed image requires fewer nodes than the original image, which greatly reduces memory consumption.

Fig. 1 GraphMriNet model construction flow chart

The GIN model comprises three graph convolution modules, each consisting of a GINConv layer, a ReLU layer, a Dropout layer, and a Norm layer. The outputs are pooled by global average pooling and passed through a linear layer. Finally, the log-softmax function is applied to obtain the classification results.

Suppose \(x\) denotes the MRI images, \(y\) the output classification results, \(n\) the number of training datasets, \(m\) the number of training images in each training dataset, \(G_{\max }\) the Prewitt filter function, \(AGGREGATE\) the aggregation function, \({\text{COMBINE}}\) the combination function, and \(\kappa\) the readout function. The GraphMriNet model training algorithm is as follows:

figure a

GraphMriNet algorithm

First, all the images in the dataset are traversed and Prewitt-filtered to extract the image edge information (lines 1–3). The Prewitt-filtered images are then transformed into graphs: pixels with a gray value greater than 128 are treated as graph nodes, and the rest are considered as edges of the graph (lines 4–8). Next, the first-order aggregated neighborhood features of each node are obtained with the AGGREGATE function (line 9), and the COMBINE function merges these aggregated neighbor features with the current node information to update it (line 10). The updated node features are then passed to the READOUT function to obtain the feature representation of the whole graph (line 11). Finally, the whole-graph features are fed into the GIN model for training, and the final classification results are obtained through the log-softmax function (lines 12–13).
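The graph-construction step can be illustrated with a minimal sketch. The text does not fully specify how edges are formed, so this sketch assumes that two retained pixels are linked when they are horizontally or vertically adjacent (4-neighborhood); the function name `image_to_graph`, the `threshold` parameter, and the inclusive comparison (at least 128) are likewise illustrative choices rather than the authors' implementation.

```python
import numpy as np

def image_to_graph(edge_img, threshold=128):
    """Turn a Prewitt-filtered grayscale image into node features and an edge list.

    Assumption: pixels with intensity >= threshold become nodes, and two nodes
    are connected when they are horizontal or vertical neighbours.
    """
    edge_img = np.asarray(edge_img)
    mask = edge_img >= threshold
    rows, cols = np.nonzero(mask)  # retained pixels, in row-major order

    # Map each retained pixel position to a consecutive node id (-1 = no node).
    node_id = -np.ones(edge_img.shape, dtype=np.int64)
    node_id[rows, cols] = np.arange(rows.size)

    # One feature per node: the gray intensity of the pixel.
    features = edge_img[rows, cols].astype(np.float64).reshape(-1, 1)

    edges = []
    for r, c in zip(rows, cols):
        # Look only right and down so each undirected edge is listed once.
        for dr, dc in ((0, 1), (1, 0)):
            rr, cc = r + dr, c + dc
            if rr < mask.shape[0] and cc < mask.shape[1] and mask[rr, cc]:
                edges.append((node_id[r, c], node_id[rr, cc]))
    edge_index = np.array(edges, dtype=np.int64).reshape(-1, 2)
    return features, edge_index
```

Because only pixels above the threshold become nodes, the graph is usually much smaller than the pixel grid, which is the memory saving described above.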

Prewitt operator

Image edges represent the variation of pixel intensity in an image and contain most of the information of the image. The Prewitt operator is commonly used in the field of target detection, mainly for image noise reduction and edge enhancement by filtering in two directions [32]. The Prewitt operator uses two \(3 \times 3\) filters to calculate the gray-level difference between each pixel and its adjacent pixels in the horizontal and vertical directions of the image, and finally takes the maximum of the summed gray-level differences. The two directional \(3 \times 3\) filter matrices of the Prewitt operator and the calculation formulas are as follows:

$$ H_{x} = \left[ {\begin{array}{*{20}c} { - 1} & 0 & 1 \\ { - 1} & 0 & 1 \\ { - 1} & 0 & 1 \\ \end{array} } \right],\;H_{y} = \left[ {\begin{array}{*{20}c} 1 & 1 & 1 \\ 0 & 0 & 0 \\ { - 1} & { - 1} & { - 1} \\ \end{array} } \right] $$
(1)
$$ \begin{gathered} L_{i,j} = |[f(i - 1,j - 1) + f(i - 1,j) + f(i - 1,j + 1)] \hfill \\ \;\;\;\;\;\;\;\; - [f(i + 1,j - 1) + f(i + 1,j) + f(i + 1,j + 1)]| \hfill \\ \end{gathered} $$
(2)
$$ \begin{gathered} H_{i,j} = |[f(i - 1,j + 1) + f(i,j + 1) + f(i + 1,j + 1)] \hfill \\ \;\;\;\;\;\;\;\; - [f(i - 1,j - 1) + f(i,j - 1) + f(i + 1,j - 1)]| \hfill \\ \end{gathered} $$
(3)
$$ G_{i,j} = L_{i,j} + H_{i,j} $$
(4)
$$ G_{\max } = \max (G_{i,j} ) $$
(5)

where \(L_{i,j}\) is the absolute grayscale difference of the image in the vertical direction; \(H_{i,j}\) is the absolute grayscale difference of the image in the horizontal direction; \(G_{i,j}\) is the sum of the vertical and horizontal grayscale differences; and \(G_{\max }\) is the maximum of all grayscale differences of the image. The flow of the Prewitt filtering algorithm is shown in Fig. 2:

Fig. 2 Flowchart of the Prewitt filtering algorithm
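Equations 2–5 can be written compactly with array slicing. The following is a minimal NumPy sketch, not the authors' code: the function name `prewitt_edges` is hypothetical, and leaving the one-pixel border at zero is an assumption, since the text does not say how image borders are handled.

```python
import numpy as np

def prewitt_edges(f):
    """Compute G[i, j] = L[i, j] + H[i, j] (Eqs. 2-4) and G_max (Eq. 5).

    Assumption: border pixels, which lack a full 3x3 neighbourhood, stay 0.
    """
    f = np.asarray(f, dtype=np.float64)

    # Eq. 2: row above minus row below (vertical grayscale difference).
    top = f[:-2, :-2] + f[:-2, 1:-1] + f[:-2, 2:]
    bottom = f[2:, :-2] + f[2:, 1:-1] + f[2:, 2:]
    L = np.abs(top - bottom)

    # Eq. 3: right column minus left column (horizontal grayscale difference).
    right = f[:-2, 2:] + f[1:-1, 2:] + f[2:, 2:]
    left = f[:-2, :-2] + f[1:-1, :-2] + f[2:, :-2]
    H = np.abs(right - left)

    # Eq. 4 on the interior pixels, Eq. 5 over the whole image.
    G = np.zeros_like(f)
    G[1:-1, 1:-1] = L + H
    return G, G.max()
```

For a 5 × 5 image whose left three columns are 0 and right two columns are 10, the vertical difference is zero everywhere and the horizontal difference is 30 at the interior pixels of the two columns next to the step, so the response is concentrated on the edge.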

The Prewitt operator exhibits excellent performance in detecting image edges. Figure 3 compares three commonly used filtering algorithms on the target dataset. Specifically, the Canny operator effectively filters noise and demonstrates strong edge detection abilities, but it tends to prioritize edges, often overlooking finer details. The Laplacian operator has weaker noise suppression ability, and the processed images have more noise and weaker detection ability for edges. Compared with the above two algorithms, the Prewitt operator has excellent edge detection ability and noise suppression ability, and the image after Prewitt processing has obvious edges and outstanding details, which can better reflect the focus of the image. Consequently, this paper opts for the Prewitt operator to extract image information for use as the data input in the graph isomorphic network.

Fig. 3 Comparison of three filtering algorithms

Graph isomorphism network

GIN is a simple graph neural network (GNN) architecture proposed by Xu et al. [33]. A graph \(G\) can be described by its set of nodes \(V\) and set of edges \(E\) as \(G = (V,E)\). In a graph classification task, we are given a collection of graphs \(\{ G_{1} ,...,G_{N} \} \subseteq \rho\) with labels \(\{ Y_{1} ,...,Y_{N} \} \subseteq \omega\). The goal of a GNN is to take graph-structured data as input and learn a vector \(h_{G}\) from which the category of the graph is predicted as \(Y_{G} = \chi (h_{G} )\). GNNs follow a neighborhood aggregation strategy: they iteratively update the representation of a node by aggregating the representations of its neighbors [34]. After \(k\) aggregation iterations, the structural information within the \(k\)-hop neighborhood is captured, which can be expressed as follows:

$$ a_{u}^{(k)} = AGGREGATE^{(k)} (\{ h_{v}^{(k - 1)} :v \in N(u)\} ) $$
(6)
$$ h_{u}^{(k)} = COMBINE^{(k)} (h_{u}^{(k - 1)} ,a_{u}^{(k)} ) $$
(7)

where \(h_{u}^{(k)}\) denotes the vector representation of node \(u\) at the \(k\)-th layer. The AGGREGATE function aggregates the features of the node's neighborhood; the COMBINE function combines the node's feature vector from the previous layer with the aggregated neighborhood features of the current layer to obtain the updated feature vector of the node. For graph classification, all node features need to be converted into a vector representation of the entire graph by the READOUT function:

$$ h_{G} = READOUT(\{ h_{u}^{(K)} ,u \in G\} ) $$
(8)

The READOUT function is a permutation-invariant function such as summation, averaging or maximum.

When performing graph classification tasks, the problem of determining graph isomorphism needs to be solved. The WL (Weisfeiler–Lehman) test is very useful in dealing with graph isomorphism problems. Xu et al. [33] rigorously proved that the discriminative power of GNNs is upper-bounded by the WL test, and on this basis proposed the graph isomorphic network (GIN), which has the same discriminative and expressive power as the WL test.

As shown in Fig. 4, after the graph nodes undergo two WL iterations, the label of blue node 1 can be represented by a 2-level subtree with the yellow node as the root. If the aggregation function of the GIN model captures all neighbors of a node, GIN can recursively capture all subtrees rooted at that node and obtain a distinct representation for each node, which is then used to determine whether two graphs are isomorphic.

Fig. 4 WL test iteration
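The WL iteration can be sketched in a few lines of plain Python. In this illustrative version (the function name `wl_histogram` and the adjacency-dictionary input are assumptions), every node starts with the same label, and after each iteration a node's label is the pair of its old label and the sorted labels of its neighbors, which is exactly the rooted subtree described above. Two graphs whose label histograms differ cannot be isomorphic.

```python
from collections import Counter

def wl_histogram(adj, iterations=2):
    """Return the multiset of WL labels after the given number of iterations.

    adj maps each node to a list of its neighbours. Labels are kept as nested
    tuples (subtree encodings) so histograms of different graphs are comparable.
    """
    labels = {v: () for v in adj}  # identical initial label for every node
    for _ in range(iterations):
        labels = {
            v: (labels[v], tuple(sorted(labels[u] for u in adj[v])))
            for v in adj
        }
    return Counter(labels.values())

# A 3-node path and a triangle get different histograms, so the test
# reports them as non-isomorphic.
path = {0: [1], 1: [0, 2], 2: [1]}
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
print(wl_histogram(path) == wl_histogram(triangle))
```

Equal histograms do not prove isomorphism: a 6-node cycle and two disjoint triangles, for example, receive identical labels because every node has degree 2.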

The GIN model takes the one-hot encoding of nodes as input and iteratively updates the node feature vectors in each layer. The iterative process is shown in Fig. 5: the node features output by one GIN layer serve as the input to the next GIN layer, where the neighboring node features are first aggregated and the aggregated feature vector is then fed into a multilayer perceptron:

$$ h_{u}^{(k + 1)} = MLP((1 + \varepsilon^{(k)} )h_{u}^{(k)} + \sum\limits_{v \in N(u)} {h_{v}^{(k)} } ) $$
(9)

where \(\varepsilon^{(k)}\) is a learnable parameter. By aggregating the node features of the previous layer and feeding them into the multilayer perceptron (MLP), the node feature vector of the current layer is obtained. Subsequently, the node feature vectors are fed into the READOUT function to generate the vector representation of the entire graph.

Fig. 5 GIN model node update process
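As a rough illustration of Eq. 9 followed by a mean READOUT and log-softmax, the sketch below uses a dense adjacency matrix and a two-layer MLP. The weight names `w1`, `b1`, `w2`, `b2`, the MLP depth, and the dense representation are assumptions made for readability; they are not taken from the authors' implementation, which would typically rely on a graph learning library.

```python
import numpy as np

def gin_layer(h, adj, w1, b1, w2, b2, eps=0.0):
    """One GIN update (Eq. 9): h_u <- MLP((1 + eps) * h_u + sum of neighbour h_v).

    h is (num_nodes, d) and adj is a dense 0/1 adjacency matrix, so adj @ h
    sums the neighbour features of every node at once.
    """
    z = (1.0 + eps) * h + adj @ h
    hidden = np.maximum(z @ w1 + b1, 0.0)  # ReLU inside the assumed 2-layer MLP
    return hidden @ w2 + b2

def mean_readout(h):
    """Permutation-invariant READOUT: average the node vectors of one graph."""
    return h.mean(axis=0)

def log_softmax(logits):
    """Numerically stable log-softmax over a 1-D vector of class scores."""
    shifted = logits - logits.max()
    return shifted - np.log(np.exp(shifted).sum())
```

On a 3-node path with scalar features 1, 2, 3, identity weights and `eps = 0`, the layer returns 3, 6, 5: each node's own feature plus the sum of its neighbors' features.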

Results

Datasets

In order to verify the effectiveness and robustness of the GraphMriNet model proposed in this paper, experimental validation is conducted on four publicly available datasets:

(1) BMIBTD [35]: The dataset includes 155 brain tumor slices and 98 normal brain slices. 2D slices of various brain tumor types form the tumor category, and the remaining slices without tumor lesions form the normal category.

(2) CE-MRI [36]: The dataset was collected from Guangzhou Southern Medical University and Tianjin Medical University between 2005 and 2010 and contains three brain tumor types, namely glioma, meningioma and pituitary tumor, from 233 patients, with a total of 3064 images. To verify the training of the model with small samples, 200 images of each of the three brain tumor types were randomly selected and input into the model for training.

(3) BTC-MRI [37]: The dataset includes four classes, namely glioma, meningioma, pituitary tumor and normal, with a total of 3264 images. 300 images of each class were randomly selected and input into the model for training.

(4) FSB [38]: The dataset was assembled from Br35H, SARTAJ and other datasets and includes four classes: glioma, meningioma, pituitary tumor and normal. To validate the classification performance of the model with small and unbalanced samples, 15, 70, 300 and 140 images were selected for the glioma, meningioma, pituitary tumor and normal categories, respectively.

The distribution of categories in the above four datasets is shown in Table 3:

Table 3 Distribution of datasets

Generally speaking, the GIN model does not require input pixel normalization. Therefore, in this paper, the images are neither normalized nor standardized when fed into the GIN model. For the performance comparison with mainstream models, the images are resized to \(128 \times 128\) pixels to speed up training before being input into the mainstream deep learning models. The target dataset is divided into a training set (80%) and a test set (20%), and 20% of the training set is then held out as a validation set to verify model accuracy during training.
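The two-stage split described above can be sketched as follows. The function name `split_indices`, the fixed random seed, and the unstratified shuffle are assumptions; the text does not state whether the split was stratified by class.

```python
import numpy as np

def split_indices(n_samples, test_frac=0.2, val_frac=0.2, seed=0):
    """Shuffle sample indices, hold out a test set, then carve a validation
    set out of the remaining training indices."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)

    n_test = int(round(n_samples * test_frac))
    test, train_full = order[:n_test], order[n_test:]

    n_val = int(round(train_full.size * val_frac))
    val, train = train_full[:n_val], train_full[n_val:]
    return train, val, test

# For the 600 selected CE-MRI images this gives 384 / 96 / 120 images
# for training / validation / testing.
train, val, test = split_indices(600)
print(len(train), len(val), len(test))
```

Under this reading, the validation set amounts to 16% of all images (20% of the 80% training portion), leaving 64% for actual training.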

Implementation

The experiments in this paper use Python 3.8.3, scikit-learn 0.24.2 and TensorFlow 2.4.1; the detailed experimental configuration is shown in Table 4:

Table 4 Experimental software and hardware configuration environment

In this paper, we use accuracy, sensitivity, specificity, F1-score and ROC curve as the performance evaluation indexes of the model; the first four are calculated as shown in Eqs. 10–13, respectively:

$$ Accuracy = \frac{TP + TN}{{TP + FP + FN + TN}} $$
(10)
$$ Sensitivity = Recall = \frac{TP}{{TP + FN}} $$
(11)
$$ Specificity = \frac{TN}{{TN + FP}} $$
(12)
$$ F1{\text{-score}} = \frac{2 \cdot Precision \cdot Recall}{{Precision + Recall}} $$
(13)

Accuracy indicates the percentage of correctly classified samples among all samples. Sensitivity, also called recall, indicates the percentage of correctly classified positive samples among all positive samples. Specificity indicates the percentage of correctly classified negative samples among all negative samples. The F1-score is the harmonic mean of precision and recall. The ROC (receiver operating characteristic) curve shows the trade-off between the true positive rate (TPR) and the false positive rate (FPR).
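A small worked example may help make Eqs. 10–13 concrete. The helper below is a sketch for the binary case; the function name `binary_metrics` and the example counts are hypothetical, precision is taken as TP / (TP + FP) (its standard definition, which the text does not spell out), and no guard against empty classes is included. How the per-class values are averaged for the three- and four-class datasets is not stated in the text.

```python
def binary_metrics(tp, fp, fn, tn):
    """Evaluate Eqs. 10-13 from the four confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)      # Eq. 10
    sensitivity = tp / (tp + fn)                    # Eq. 11 (recall)
    specificity = tn / (tn + fp)                    # Eq. 12
    precision = tp / (tp + fp)                      # standard definition
    f1 = 2 * precision * sensitivity / (precision + sensitivity)  # Eq. 13
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }

# Hypothetical counts: 9 true positives, 1 false positive,
# 3 false negatives, 7 true negatives.
scores = binary_metrics(tp=9, fp=1, fn=3, tn=7)
print(scores)
```

With these counts, accuracy is 16/20 = 0.8, sensitivity is 9/12 = 0.75, and specificity is 7/8 = 0.875.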

Comparison with existing work

To evaluate the effectiveness of the GraphMriNet model, experiments were conducted on four publicly available datasets: BMIBTD, CE-MRI, BTC-MRI and FSB. All experiments used Adam as the optimizer with a learning rate of 0.0001 and 10 training epochs, and the model was evaluated with fivefold cross-validation; the reported results are averages across the folds. Table 5 shows the performance of the GraphMriNet model on the test sets of the four datasets and its comparison with existing research work; Fig. 6 shows the training loss, the accuracy under different sample ratios, and the ROC curves of the GraphMriNet model.
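The fivefold protocol can be sketched as an index generator. This is an illustrative outline only: the name `kfold_indices`, the fixed seed, and the unstratified shuffle are assumptions, and the model training and scoring that would run inside the loop are omitted.

```python
import numpy as np

def kfold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_samples), k)
    for i in range(k):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, test_idx

# Each sample appears in exactly one test fold; a per-fold metric would be
# computed inside the loop and the k values averaged at the end.
fold_sizes = [len(test_idx) for _, test_idx in kfold_indices(253)]
print(fold_sizes)
```

For the 253 BMIBTD images, the folds contain 50 or 51 test images each.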

Table 5 Performance of GraphMriNet on four datasets and its comparison with existing studies
Fig. 6 Model performance indicators

Table 5 shows that the GraphMriNet model, which combines the graph isomorphic network with the Prewitt filtering algorithm, achieves an accuracy, sensitivity, specificity, and F1-score of 1 on the BMIBTD, CE-MRI, and BTC-MRI datasets. On the FSB dataset, these metrics decrease slightly to 99.68%, 98.44%, 98.62%, and 98.72%, respectively. Overall, the diagnostic accuracy of the GraphMriNet model surpasses that of the deep learning models proposed in existing studies on the four datasets mentioned above, demonstrating excellent brain tumor diagnostic performance. Because the GIN model attains the theoretical upper limit of GNN discriminative power, matching that of the WL test, it is well suited to the graph isomorphism problem. The GIN model can map isomorphic graphs to the same vector representation and non-isomorphic graphs to different vector representations for classification. The GraphMriNet model proposed in this paper preprocesses images using the Prewitt operator and combines it with the GIN model to accurately capture the input graph structure. As a result, it starts with a low training loss and reaches high training accuracy in few iterations. As shown in Fig. 6, the GraphMriNet model reduces the loss to almost zero after two iterations while maintaining consistent classification accuracy. Additionally, changing the sample ratio between the training set and the test set illustrates the model's performance with different training and test sample sizes. The GraphMriNet model shows strong and robust classification ability across the training sample sizes tested. The ROC curve reflects the model's true positive rate and false positive rate at different thresholds. The larger the area under the curve, the better the classification performance of the model. The ROC curves in Fig. 6 show that the GraphMriNet model exhibits excellent classification performance under different thresholds.

Performing classification tasks on small and unbalanced datasets poses a challenge for deep learning models. To assess the classification ability of the GraphMriNet model with small and unbalanced samples, 15, 70, 300, and 140 instances were selected for the glioma, meningioma, pituitary tumor, and normal categories of the FSB dataset, respectively, resulting in a 20-fold difference in size between the largest and smallest categories (300 versus 15). In this setting, the GraphMriNet model showed a certain degree of performance degradation, and its accuracy on the test set fluctuated significantly when training with different sample ratios. This is due to the small number of glioma samples: if few or no samples of this category are selected during training, the model cannot distinguish the category effectively.

The major differences between the GraphMriNet model and the models proposed in the existing studies are:

(1) Due to the limitation of the sample size of the dataset, to achieve better classification performance, most existing studies adopt a transfer learning model. They use pre-trained 'ImageNet' parameters and subsequently fine-tune them on the sample dataset to obtain the classification model. In the field of brain tumor classification, the model parameters may not converge easily, and the classification accuracy may be unsatisfactory when applying model transfer, due to the significant differences between brain tumor datasets and the ImageNet dataset. In contrast, the GraphMriNet model benefits from the powerful isomorphic graph classification ability of the GIN model, which is insensitive to sample size. In the case of a small dataset, with guaranteed avoidance of overfitting, the GraphMriNet model also surpasses the training effect of existing studies on larger datasets and exhibits strong generalization ability.

(2) To minimize model loss, traditional neural network models often necessitate multiple training iterations, significantly increasing the model's training time and cost. In contrast, the GraphMriNet model incorporates the Prewitt filtering algorithm for edge detection and the extraction of key information from brain tumor images. By disregarding redundant image information, it significantly reduces the number of model training iterations, thereby enhancing the training efficiency of the model.

Overall, the GraphMriNet model has excellent classification performance on the above four publicly available datasets and retains strong classification accuracy in some extreme cases, so it can accurately classify and diagnose various types of brain tumors in few-shot scenarios.

Performance comparison with classical deep learning models

To further validate the performance of the GraphMriNet model, it was compared with seven mainstream deep learning models on three publicly available datasets: BMIBTD, CE-MRI, and BTC-MRI. Each mainstream model used weights pre-trained on the large ImageNet dataset, with Adam as the optimization algorithm, a learning rate of 0.0001, and 50 training iterations. To verify the classification ability of each model under small samples, none of them used any data augmentation techniques. The relevant metrics are shown in Table 6.

Table 6 Performance comparison of different models on three datasets

As seen in Table 6 and Fig. 7, the GraphMriNet model achieves 100% accuracy, sensitivity, and specificity on the three datasets BMIBTD, CE-MRI, and BTC-MRI, outperforming all the mainstream deep learning models.

Fig. 7 Performance comparison of mainstream models

On the BMIBTD dataset, EfficientNetB7 outperforms other models with an accuracy, sensitivity, and specificity of 88%, 80.70%, and 53.80%, respectively, while the remaining models exhibit poor performance. This can be attributed to the limited sample size of the BMIBTD dataset, involving only 253 images in the training process. In general, mainstream deep learning models require more samples, and if the training data is limited, the model's classification ability often diminishes due to its inability to effectively learn image features.

For the CE-MRI dataset, where the sample size increased to 600, models such as Xception, DenseNet-201, and InceptionV3 performed better. Xception and InceptionV3 have modified the deep network structure of ResNet, focusing on extracting more information with a restricted network depth. Xception has pushed the InceptionV3 principle to its extreme, endowing it with robust feature extraction capabilities. Despite accurately diagnosing three different brain tumor types with a small sample size, Xception's relevant indicators still do not meet the requirements for clinical application.

On the BTC-MRI dataset, with the sample size further increased to 1200 and the model tasked with discriminating between four brain tumor types, the DenseNet-201 model exhibits the best performance among the baselines, with all three metrics falling within the [0.95, 0.99] interval. This is attributed to the distinctive network architecture of DenseNet, which feeds the output of each layer as input to all subsequent layers in a feed-forward manner. This connection pattern departs from the traditional design in which each layer connects only to the next, enhancing feature transfer between layers and making more effective use of features.

Experiments on the above three publicly available datasets demonstrate that the GraphMriNet model has excellent diagnostic capability for brain tumors in the case of small samples. The classification performance of the GraphMriNet model is improved by about 13.6–56.3%, 3.4–300% and 4.8–103.1% on the BMIBTD, CE-MRI and BTC-MRI datasets, respectively, compared with the remaining mainstream models.
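The percentage ranges above appear to be relative rather than absolute improvements. This is our reading, not something the text states explicitly: the lower bound of 13.6% on BMIBTD matches the relative gain of 100% accuracy over the 88% reported for EfficientNetB7. A minimal sketch of that calculation, with the hypothetical helper name `relative_improvement`:

```python
def relative_improvement(baseline, improved):
    """Relative improvement of `improved` over `baseline`, in percent."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (improved - baseline) / baseline * 100.0


# EfficientNetB7 accuracy on BMIBTD (88%) versus GraphMriNet (100%).
print(round(relative_improvement(88.0, 100.0), 1))  # 13.6
```

Under this reading, a large upper bound such as 300% corresponds to a baseline metric that is only a quarter of the GraphMriNet value, not to a metric above 100%.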

The GraphMriNet model uses the Prewitt filtering algorithm to extract image edges, which carry the crucial information in the image. This sharpens the model's focus on lesions, enhances its classification ability, and reduces the overall training time. The Prewitt-processed edge images are then transformed into graphs and fed into the GIN model for training. Leveraging the strong graph classification capabilities of the GIN model and appropriate architectural settings, GraphMriNet accurately and efficiently diagnoses various classes of brain tumors.
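For readers unfamiliar with the Prewitt operator, the sketch below shows the standard formulation: two 3×3 kernels estimate the horizontal and vertical intensity gradients, and the edge strength is the gradient magnitude. It is a plain-Python illustration under our own assumptions (border pixels left at zero, no normalization or clipping), not the authors' implementation; the function name `prewitt_magnitude` and the toy image are hypothetical.

```python
import math

# Standard 3x3 Prewitt kernels for horizontal (x) and vertical (y) gradients.
PREWITT_X = [[-1, 0, 1],
             [-1, 0, 1],
             [-1, 0, 1]]
PREWITT_Y = [[-1, -1, -1],
             [0, 0, 0],
             [1, 1, 1]]


def prewitt_magnitude(image):
    """Return the Prewitt gradient magnitude of a 2D grayscale image (list of rows).

    Assumption: border pixels are left at 0.0 because the 3x3 window
    does not fit there; other border strategies are equally valid.
    """
    height, width = len(image), len(image[0])
    edges = [[0.0] * width for _ in range(height)]
    for r in range(1, height - 1):
        for c in range(1, width - 1):
            gx = gy = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    pixel = image[r + dr][c + dc]
                    gx += PREWITT_X[dr + 1][dc + 1] * pixel
                    gy += PREWITT_Y[dr + 1][dc + 1] * pixel
            edges[r][c] = math.hypot(gx, gy)
    return edges


# Toy image: dark left half (0), bright right half (255), i.e. one vertical edge.
image = [[0, 0, 0, 255, 255, 255] for _ in range(4)]
edges = prewitt_magnitude(image)
print(edges[1])  # [0.0, 0.0, 765.0, 765.0, 0.0, 0.0]
```

Only the two columns adjacent to the intensity step respond; the flat regions map to zero, which is the sense in which the filter keeps edge information and discards uniform background.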

Comparison of performance of applied Prewitt operator and discussion of model parameters

In the GraphMriNet model, applying the Prewitt operator not only improves diagnostic accuracy but also greatly reduces the number of training iterations. To demonstrate the effectiveness of the Prewitt operator, four mainstream models were evaluated on both the original images and the edge images produced by Prewitt filtering.

As shown in Table 7, the accuracy of almost all models improved to some extent after Prewitt filtering. After applying the Prewitt operator, model accuracy increased on average by about 12.7%, 8.1%, and 8.2% on the BMIBTD, CE-MRI, and BTC-MRI datasets, respectively. The exception is the Xception model, whose accuracy decreased markedly on the Prewitt-processed BTC-MRI dataset because it could not make a valid diagnosis for the Pituitary class.

Table 7 Performance of Prewitt method in three datasets

In general, the Prewitt operator has good edge detection capability. Acting much like an attention mechanism, it can improve accuracy and reduce the number of training iterations, and it can improve the diagnostic accuracy of mainstream deep learning models on the relevant datasets.

Table 8 lists the processing time of the Xception, VGG-16, DenseNet-201, and EfficientNetB7 models before and after applying the Prewitt algorithm on the three datasets. After Prewitt processing, the processing time of the models decreases to varying degrees. Compared with the original images, the average reduction in processing time is 15.8% on the BMIBTD dataset, 8.3% on the CE-MRI dataset, and 3.3% on the BTC-MRI dataset. The reason is that the Prewitt-processed image retains the key information of the original image while discarding irrelevant information. This reduces the training time of the model, which is important for improving its diagnostic efficiency.

Table 8 Model time consumption

In the GIN model, the choice of READOUT function has a large impact on performance. The READOUT function obtains a feature representation of the whole graph by aggregating all node features. Commonly used READOUT functions include summation (SUM), averaging (AVG), and maximization (MAX); Table 9 shows the classification accuracy obtained with each of them on the FSB dataset.

As evident from Table 9, the classification accuracy achieved using the averaging function surpasses that of the other two approaches. This deviates from the findings of Xu et al. [33], who conclude that the expressiveness of summation is superior to that of averaging and maximization. In general, the summation function captures both the node labels and their counts, emphasizing precise graph information. The averaging function, on the other hand, tends to learn the distribution of the graph nodes, while the maximization function is biased towards learning from representative elements and overlooks diversity [33]. In brain structure graphs, brain tumors typically manifest as larger regions (or shaded areas), so the corresponding nodes have similar features and a consistent distribution. This matches the tendency of the averaging function to learn the distribution of graph nodes, which explains why our results differ from those of Xu et al. Therefore, in this paper, we adopt averaging as the READOUT function of the GIN model.
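The difference between the three READOUT functions can be seen on a toy example. The sketch below is a plain-Python illustration of feature-wise SUM, AVG, and MAX pooling over node features; the function name `readout` and the two toy graphs are hypothetical, and the code is not the GIN implementation used in the paper.

```python
def readout(node_features, mode="sum"):
    """Aggregate per-node feature vectors into one graph-level vector."""
    if not node_features:
        raise ValueError("graph must contain at least one node")
    columns = list(zip(*node_features))  # one tuple per feature dimension
    if mode == "sum":
        return [sum(col) for col in columns]
    if mode == "avg":
        return [sum(col) / len(col) for col in columns]
    if mode == "max":
        return [max(col) for col in columns]
    raise ValueError(f"unknown READOUT mode: {mode}")


# Two toy graphs with the same feature distribution but different sizes.
small = [[1.0, 0.0], [0.0, 1.0]]
large = small * 3  # same two node types, three times as many nodes

print(readout(small, "sum"), readout(large, "sum"))  # [1.0, 1.0] [3.0, 3.0]
print(readout(small, "avg"), readout(large, "avg"))  # [0.5, 0.5] [0.5, 0.5]
print(readout(small, "max"), readout(large, "max"))  # [1.0, 1.0] [1.0, 1.0]
```

SUM separates the two graphs because it is sensitive to node counts, whereas AVG depends only on the proportion of each node type and MAX only on which feature values are present. This is the property the discussion above relies on when it links averaging to the distribution of node features.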

Table 9 The classification accuracy with different types of READOUT function

Conclusions

In recent years, the global incidence of brain tumors has been on the rise, accompanied by a low cure rate, making it a significant disease burden [39]. To improve the accuracy and efficiency of brain tumor diagnosis, machine learning algorithms are being employed for brain tumor classification, which is of substantial practical importance in helping medical professionals reach precise diagnoses. In this paper, we propose GraphMriNet, a brain tumor diagnosis model based on the Prewitt operator and the graph isomorphism network (GIN). The method uses the Prewitt operator to extract edges from brain tumor images, highlighting crucial image information. Subsequently, pixels with a gray intensity greater than 128 are taken as graph nodes, while the remaining pixels are treated as the edges of the graph. Finally, this information is fed into the GIN model for training. Experimental results demonstrate that the GraphMriNet model achieves classification accuracies of 100%, 100%, 100%, and 99.68% on four publicly available datasets: BMIBTD, CE-MRI, BTC-MRI, and FSB, respectively. Notably, the diagnostic accuracy is improved by 0.8–5.3% relative to existing studies, enabling precise diagnosis of various brain tumor types. Moreover, the model achieves lower training loss and its best diagnostic accuracy with fewer iterations. GraphMriNet thus improves the efficiency of brain tumor diagnosis in small-sample settings and can provide clinical guidance to assist doctors in making accurate medical decisions.

For future clinical application, two directions remain. First, the error and deviation analysis needs to be refined for real medical scenarios to further improve the diagnostic reliability of the model. Second, future work will focus on making the model lightweight, so that it can run on mobile devices or support remote diagnosis and thereby better guide doctors in clinical decision-making.