1 Introduction

White blood cells (WBCs) are important building blocks of the immune system that help fight infection and protect the body against foreign substances such as viruses. Diagnosing diseases such as blood cancer and AIDS from WBCs is important for hematologists. Separating WBCs into subtypes is difficult because cell shape varies in images during maturation. To overcome this problem, machine learning methods and new-generation convolutional neural network (CNN) structures have been proposed [1,2,3,4,5]. Macawile et al. proposed a method that can segment cells from microscopic blood images. The method is based on a CNN that can classify monocytes, neutrophils, lymphocytes, basophils and eosinophils from a microscopic blood image in the Hue Saturation Value color space [6]. Sahlol et al. proposed an advanced hybrid approach for the effective classification of leukemia. First, features were extracted from WBC images using VGGNet. Second, the extracted features were filtered with the Salp Swarm Algorithm. The approach was applied to two public WBC reference datasets, and the results showed high accuracy with reduced computational complexity [7]. Ramesh et al. proposed a classification framework based on color information and morphology. The performance of the algorithm was evaluated by comparing it with the visual classification of a hematopathologist. The algorithm was applied to 1938 WBC sub-images, of which 1804 were correctly classified. In the first stage of the two-stage classification, WBCs were broadly classified into cells with segmented nuclei and cells with non-segmented nuclei. In the second stage, features were extracted to classify the WBCs by linear discriminant analysis. The system was evaluated using k-fold cross-validation. An overall accuracy of 93.9% was obtained for the five-subtype classification with the two-stage approach [8]. Su et al. proposed a new algorithm for the segmentation of WBCs from smear images. 
The main idea of the algorithm is to find a discriminating region of WBCs in the HSI (Hue Saturation Intensity) color space. Three kinds of features (geometric features, color features and LDP-based texture features) were extracted and fed to three different neural networks to recognize the WBC types. A total of 450 WBC images were used to test the effectiveness of the proposed classification system, and the highest accuracy was 99.11% [9]. Kutlu et al. classified blood cells with Regional Convolutional Neural Networks. The proposed architectures were trained and tested on a combination of the BCCD and LISC (Leukocyte Images for Segmentation and Classification) data sets. Classification was implemented using the AlexNet, VGG16, GoogLeNet and ResNet50 architectures. The proposed system showed 100% success in identifying WBCs. With the ResNet50 architecture, lymphocytes were identified with 99.52% accuracy, monocytes with 98.40%, basophils with 98.48%, eosinophils with 96.16% and neutrophils with 95.04% [10]. Barrero et al. developed a system to identify and classify blood cells using Gaussian radial basis function networks (RBFN). While the overall accuracy in the classification of WBCs is 97.9%, the sensitivity by cell type, according to professionals, is 93.4% for lymphocytes, 79.5% for neutrophils, 97.37% for monocytes, 73.07% for eosinophils and 100% for basophils [11]. Habibzadeh et al. examined the classification of WBCs into four major types (eosinophils, neutrophils, lymphocytes and monocytes) using deep learning. After the preprocessing phase, WBC recognition was performed with hierarchical topological feature extraction by means of ResNet and Inception architectures. For training and testing, 11,200 and 1244 images were used, respectively. 
ResNet50 correctly detected 100% of the four main WBC types on average, while promising accuracies of 99.84% and 99.46% were obtained with ResNet152 and ResNet101. Other statistics derived from the confusion matrix showed that the three proposed techniques reached precision values of 1.0, 0.9979 and 0.9989 and Area Under the Curve (AUC) values of 1.0, 0.9992 and 0.9833 [12]. Zhao et al. proposed an automatic detection and classification system for WBCs in blood images. First, an algorithm was developed to detect WBCs in microscope images based on a simple relation between the R and B color channels and morphological operations. Next, a granularity feature (the pairwise rotation invariant co-occurrence local binary pattern) and an SVM (Support Vector Machine) were applied to separate eosinophils and basophils from the other WBCs. Finally, CNNs were used to automatically extract high-level features from the WBCs, and a random forest algorithm was applied to these features to recognize the WBC types. Experiments on the ALL-IDB (Acute Lymphoblastic Leukemia Image Database for Image Processing) and Cellavision databases showed that the proposed method performs better than the iterative threshold method [13]. Ruberto et al. proposed a new method to recognize WBCs in microscopic blood images. Images are classified as healthy or affected by leukemia. The system was tested on public data sets for leukemia detection, such as the SMC-IDB and IUMS-IDB databases. The results were promising: 100% accuracy for the first two data sets and 99.7% for ALL-IDB in white cell detection, and 94.1% in leukemia classification [14]. Baydili et al. classified WBC images into five categories with capsule networks, a new deep learning method. 
The results obtained with the model were compared with the best-known deep learning methods, and a high accuracy (96.86%) was obtained on the test data [15]. Gupta et al. proposed the Optimized Binary Bat Algorithm (OBBA) for the classification of different types of leukocytes. The optimized algorithm is used to select a subset of the features extracted from the WBC images by removing a number of features. The algorithm was implemented with four different classifiers, k-NN (k-Nearest Neighbor), Logistic Regression, Random Forest and Decision Tree, and their performance was compared. The proposed OBBA classifies WBCs with an average sensitivity of 97.3% [16]. Shahin et al. proposed a new identification system for WBCs based on CNNs. In addition, a new end-to-end convolutional deep architecture called “WBCsNet” was developed. In tests on three different public WBC datasets (2551 images), an accuracy of 96.1% was obtained with WBCsNet [17]. Togaçar et al. focused on classifying WBC images using CNN models. Various classifiers were applied to features derived from the AlexNet architecture to evaluate classification performance. The best performance was obtained by the Quadratic Discriminant Analysis classifier, with an accuracy of 97.78% [18]. Hedge et al. proposed a classifier that can detect abnormal cells as well as white blood cells. In the study, a traditional image processing approach and deep learning methods were compared for the classification of WBCs. An accuracy of around 99% was achieved with the CNN [19]. Malkawi et al. classified WBCs with 98.7% accuracy using a hybrid system that combines a CNN with different machine learning algorithms (SVM, KNN and Random Forest) [20]. Rezatofighi et al. proposed image processing algorithms to automatically recognize WBCs. In a two-step process, the cell nucleus and cytoplasm were first segmented based on Gram-Schmidt orthogonalization. 
Then, various features were extracted from the segmented regions, and classification was performed with an Artificial Neural Network and an SVM [21]. Pinyakupt et al. performed both segmentation and classification of five types of WBCs using Linear and Naive Bayes classifiers. The proposed system consists of preprocessing, nucleus segmentation, cell segmentation, feature extraction, feature selection and classification steps. The accuracies obtained with the Linear and Naive Bayes classifiers were 98% and 94%, respectively [22]. Sarrafzadah et al. used features extracted from both the nucleus and the cytoplasm, in addition to the properties doctors use when classifying WBCs. With the SVM algorithm, classification was achieved with 93% accuracy [23].

Algudah et al. extracted three distinct kinds of features from WBCs: morphological, statistical and textural. Principal component analysis (PCA) was used to rank the extracted features. Classification was carried out with a probabilistic neural network (PNN), a support vector machine and a random forest. The accuracy obtained was 99.6% [24].

Today, a great deal of research combines image processing with various segmentation and classification techniques to produce alternatives for WBC classification and counting. In these studies, the identification systems for WBC classification consist of preprocessing, segmentation, feature extraction and feature selection steps. The accuracy of these existing methods can still be improved. There is therefore a real need to use deep learning methodologies to improve the performance of previous WBC identification systems.

The purpose of this article is to develop a system for the identification and classification of WBCs using image processing techniques, in order to support the doctor in the diagnostic process and to reduce subjective errors in manual analysis. To this end, we propose the Alexnet-Googlenet-SVM hybrid CNN method, which can classify various types of WBCs. The proposed method comprises pre-processing, filtering, feature extraction and classification. The system uses the pretrained Alexnet and Googlenet architectures. The WBC image feature vectors from the last pooling layer of each model are combined, and classification is carried out with the SVM algorithm.

The contributions of this paper are as follows:

  • Using two different data sets, Eosinophil, Lymphocyte, Monocyte and Neutrophil WBC images were classified by the Alexnet-Googlenet-SVM hybrid CNN method.

  • A new feature vector was obtained by combining the Alexnet and Googlenet feature vectors and taking their maximum values.

  • The feature vectors obtained were classified with SVM.

  • Considering both the literature and the results obtained, WBC classification was performed with high accuracy with the Hybrid CNN model.

The rest of the article is organized as follows. The data sets used for classification and their properties are presented in Sect. 2. The structure of the proposed method is detailed in Sect. 3. The results and discussion are presented in Sect. 4. The last section gives a brief evaluation of the proposed method.

2 Data sets

In this article, two different data sets, from the Kaggle website and the LISC database, are classified [25, 26]. The data set from the Kaggle website contains 12,500 augmented blood cell images (JPEG) with cell-type labels (CSV). There are approximately 3000 images for each of the 4 cell types, grouped in 4 different folders. The cell types are Eosinophil, Lymphocyte, Monocyte and Neutrophil. For testing, 248 images per class were used, 992 images in total. Figure 1 shows an example image for each class.

Fig. 1 Images of eosinophil, lymphocyte, monocyte and neutrophil, respectively

In the LISC dataset, samples were taken from the peripheral blood of 8 normal subjects, and 400 samples were obtained from 100 microscope slides. The images were recorded by a digital camera and saved in BMP format. The size of the images is 720 × 576 pixels.

The color images were collected at the Hematology-Oncology and BMT Research Center of Imam Khomeini Hospital in Tehran, Iran. The images were classified by a hematologist into normal leukocytes: eosinophil, basophil, monocyte, lymphocyte and neutrophil. In this article, a total of 189 eosinophil, lymphocyte, monocyte and neutrophil WBC images were used for classification. After augmentation was applied to these images, the number of images was 99 for each class. The augmentation method applied is rotation: images were rotated by 90, 180 and 270 degrees.
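The rotation-based augmentation described above can be illustrated with a minimal sketch. It is not the authors' Matlab code: the helper names `rotate90` and `augment_with_rotations` are hypothetical, the image is a toy nested list rather than a real pixel array, and the clockwise direction is an assumption, since the text only states the rotation angles.

```python
def rotate90(image):
    """Rotate a 2-D image (a list of rows) by 90 degrees clockwise."""
    # Reversing the row order and transposing yields a clockwise rotation.
    return [list(row) for row in zip(*image[::-1])]


def augment_with_rotations(image):
    """Return copies of `image` rotated by 90, 180 and 270 degrees."""
    rotated = []
    current = image
    for _ in range(3):
        current = rotate90(current)
        rotated.append(current)
    return rotated


# Toy 2 x 3 "image"; a real input would be a WBC pixel array.
toy = [[1, 2, 3],
       [4, 5, 6]]
rot90, rot180, rot270 = augment_with_rotations(toy)
```

Each source image yields three additional samples, so the original plus its rotations gives four images per source image.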

3 The proposed method

Convolutional Neural Networks (CNNs) such as Alexnet [27], one of the first to be developed, and Googlenet [28] have been successfully applied to problems such as medical classification [29,30,31]. The hybrid CNN model for the classification of WBC images, which is based on the pretrained Alexnet and Googlenet models, is shown in Fig. 2. The model takes WBC images as input. Convolution, normalization and pooling layers are applied to each image, which yields a feature vector for each image. In the pretrained models, these feature vectors are classified by a softmax layer. In the proposed model, a single feature vector is obtained by combining the feature vectors from the last pooling layers of the pretrained models, and the resulting feature vector is classified by the SVM algorithm instead of softmax. The basic layers of the proposed hybrid Alexnet-Googlenet-SVM model are described below.

Fig. 2 The proposed hybrid Alexnet-Googlenet-SVM model

Convolution: This layer, which is responsible for detecting the features of the image, is the main building block of a CNN. It is applied to eliminate features in the images that do not need to be trained. In the convolution layer, a filter of size k is applied to data of size W × Y × D, where W, Y and D are the width, height and depth of the input image. The width and height of the output obtained by applying the filter are calculated according to Eqs. 1 and 2.

$${\text{Output}}\;{\text{Width}} = \left( {\frac{W - F_{w} + 2P}{S_{w} }} \right) + 1$$
(1)
$${\text{Output}}\;{\text{Height}} = \left( {\frac{Y - F_{h} + 2P}{S_{h} }} \right) + 1$$
(2)

where Fh and Fw are the filter height and width, Sw and Sh are the stride width and height, and P is the padding.
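As a worked instance of Eqs. 1 and 2, the following sketch computes the output size along one axis. The function name `conv_output_size` and the example values (a 227 × 227 input, an 11 × 11 filter, stride 4, no padding) are illustrative assumptions, not settings reported in this article. Integer division is also an assumption for the case where the stride does not divide the numerator evenly.

```python
def conv_output_size(size, filter_size, stride, padding=0):
    """Output length along one axis: (size - filter + 2 * padding) / stride + 1."""
    # Floor division is assumed when the stride does not divide evenly.
    return (size - filter_size + 2 * padding) // stride + 1


# Assumed example values, chosen only to illustrate the arithmetic.
out_width = conv_output_size(227, 11, stride=4, padding=0)
out_height = conv_output_size(227, 11, stride=4, padding=0)
print(out_width, out_height)
```

With these values, (227 − 11 + 0) / 4 + 1 = 55 along each axis.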

Activation layer: This layer, which follows each convolutional layer, introduces non-linearity into the network. The Rectified Linear Unit (ReLU) function is often used in this layer because of its speed advantage (Eq. 3).

$$Relu:f\left( x \right) = \left\{ {\begin{array}{*{20}c} {0,x < 0} \\ {x,x \ge 0} \\ \end{array} } \right.,\;f\left( x \right)^{\prime } = \left\{ {\begin{array}{*{20}c} {0,x < 0} \\ {1,x \ge 0} \\ \end{array} } \right.$$
(3)
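Equation 3 translates directly into code. The sketch below uses plain Python floats, and the names `relu` and `relu_derivative` are chosen here for illustration. Following Eq. 3, the derivative at x = 0 is taken as 1.

```python
def relu(x):
    """ReLU activation: 0 for negative inputs, x otherwise (Eq. 3)."""
    return x if x >= 0 else 0.0


def relu_derivative(x):
    """Derivative of ReLU as written in Eq. 3: 0 for x < 0, 1 for x >= 0."""
    return 1.0 if x >= 0 else 0.0


values = [-2.0, -0.5, 0.0, 1.5]
activations = [relu(v) for v in values]
gradients = [relu_derivative(v) for v in values]
```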

Pooling layer: This layer is frequently added between convolutional layers. Its task is to reduce the spatial size of the representation and, with it, the number of parameters and computations in the network. Many pooling operations are used in the literature, such as average pooling, max pooling and L2-norm pooling. Equation 4 gives the output size of this layer.

$$OM = \left( {\frac{IM + 2P - F}{S}} \right) + 1$$
(4)

where IM, OM and S are the input matrix size, the output matrix size and the stride, respectively, and F and P are the filter size and the padding.
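A small sketch can make the pooling step concrete. It applies Eq. 4 to obtain the output size and then performs max pooling on a toy 4 × 4 feature map. The function names and the toy values are illustrative assumptions, and padding is left at zero for simplicity.

```python
def pool_output_size(input_size, filter_size, stride, padding=0):
    """Eq. 4: OM = (IM + 2P - F) / S + 1 (integer division assumed)."""
    return (input_size + 2 * padding - filter_size) // stride + 1


def max_pool_2d(matrix, filter_size, stride):
    """Max pooling without padding over a 2-D list of numbers."""
    out_rows = pool_output_size(len(matrix), filter_size, stride)
    out_cols = pool_output_size(len(matrix[0]), filter_size, stride)
    pooled = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            # Collect the values covered by the filter window.
            window = [matrix[r * stride + i][c * stride + j]
                      for i in range(filter_size)
                      for j in range(filter_size)]
            row.append(max(window))
        pooled.append(row)
    return pooled


feature_map = [[1, 3, 2, 4],
               [5, 6, 1, 2],
               [7, 2, 9, 0],
               [1, 8, 3, 4]]
pooled = max_pool_2d(feature_map, filter_size=2, stride=2)
```

Here a 2 × 2 window with stride 2 reduces the 4 × 4 map to 2 × 2, keeping the largest value of each window.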

Normalization: This layer normalizes the output produced by the convolution and fully connected layers in order to shorten the training time of the network. Equation 5 shows the normalization equation.

$$Y_{i} = \frac{X_{i} - \mu_{\beta } }{\sqrt {\sigma_{\beta }^{2} + \varepsilon } }$$
(5)

The variance σβ² and the mean µβ used in Eq. 5 are calculated as in Eqs. 6 and 7.

$$\sigma_{\beta }^{2} = \frac{1}{M}\mathop \sum \limits_{i = 1}^{M} \left( {X_{i} - \mu_{\beta } } \right)^{2}$$
(6)
$$\mu_{\beta } = \frac{1}{M}\mathop \sum \limits_{i = 1}^{M} X_{i}$$
(7)

where M is the number of inputs in the mini-batch, µβ and σβ are the mean and standard deviation of the mini-batch, Yi is the new value resulting from the normalization process, and ε is a small constant that prevents division by zero.
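The normalization of Eqs. 5 to 7 can be sketched in a few lines. The function name `batch_normalize` and the value `eps=1e-5` are assumptions made for illustration. The mean squared deviation of Eq. 6 is used as the variance under the square root of Eq. 5, and the learnable scale and shift found in many batch normalization layers are omitted because Eq. 5 does not include them.

```python
import math


def batch_normalize(batch, eps=1e-5):
    """Normalize a list of values to zero mean and (almost) unit variance."""
    m = len(batch)
    mean = sum(batch) / m                                # Eq. 7
    variance = sum((x - mean) ** 2 for x in batch) / m   # Eq. 6
    # Eq. 5: subtract the mean, divide by sqrt(variance + eps).
    return [(x - mean) / math.sqrt(variance + eps) for x in batch]


normalized = batch_normalize([2.0, 4.0, 6.0, 8.0])
```

For this toy batch the mean is 5 and the variance is 5, so the outputs are symmetric around zero.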

Inception module: Each module consists of convolution and max-pooling operations of different sizes. The widening effect in the Inception modules is created by running 1 × 1, 3 × 3 and 5 × 5 convolution filters and 3 × 3 max pooling in parallel. The purpose of this module is to optimize the processing load. Figure 3 shows the internal architecture of the Inception 3a module.

Fig. 3 Inception 3a Module architecture

Concat: For each image, the feature vectors of the Pool5 layer of Alexnet and the Average_Pool layer of Googlenet used in the proposed model have 9216 and 1024 elements, respectively. The Concat layer merges these feature vectors. To eliminate the size difference, padding is applied to the Googlenet Average_Pool feature vector so that its size equals that of the Alexnet feature vector. The feature vector representing the image is then obtained by applying maximization to the two equal-sized vectors, so that two different feature vectors of a single image are combined into one. Figure 4 shows the application of the maximization process to the Alexnet and Googlenet feature vectors.

Fig. 4 Concat function
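The fusion step of the Concat layer can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the text does not say which padding value is used or where it is placed, so zeros appended at the end are assumed, and the helper names `pad_to_length` and `fuse_by_maximum` are hypothetical. Short toy vectors stand in for the 9216-element and 1024-element feature vectors.

```python
def pad_to_length(vector, length, fill=0.0):
    """Append `fill` values until the vector has `length` elements."""
    # Assumption: zero padding at the end of the shorter vector.
    return list(vector) + [fill] * (length - len(vector))


def fuse_by_maximum(features_a, features_b):
    """Pad the shorter vector, then take the element-wise maximum."""
    length = max(len(features_a), len(features_b))
    a = pad_to_length(features_a, length)
    b = pad_to_length(features_b, length)
    return [max(x, y) for x, y in zip(a, b)]


# Toy stand-ins for the Pool5 (Alexnet) and Average_Pool (Googlenet) vectors.
alexnet_features = [0.2, 0.9, 0.0, 0.4, 0.7, 0.1]
googlenet_features = [0.5, 0.3, 0.8]
fused = fuse_by_maximum(alexnet_features, googlenet_features)
```

Replacing `max` with `min` or an average gives the alternative fusion functions; the fused vector would then be passed to the classifier.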

4 Experimental results and discussion

A computer with an Intel Core i7-9750H processor and 8 GB of RAM was used in the tests. The code for the Alexnet-Googlenet-SVM hybrid CNN method, which applies transfer learning with the pretrained Alexnet and Googlenet models, was written in Matlab R2019a.

In order to demonstrate the success of the proposed model, WBC images were first classified with the pretrained models Alexnet and Googlenet. The images were then classified with the Alexnet-Googlenet-SVM model. To test the performance of the proposed approach, the Sensitivity, Accuracy, Precision, F1-score and AUC metrics are used. Table 1 gives the confusion matrix parameter definitions and metrics.

Table 1 Confusion matrix parameters and metrics
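The metrics named above can be computed per class from a multi-class confusion matrix, as in the sketch below. The formulas are the standard textbook definitions and are assumed, not confirmed, to match those of Table 1. The function name `per_class_metrics` and the toy confusion matrix are illustrative, and non-zero denominators are assumed.

```python
def per_class_metrics(confusion, k):
    """Metrics for class k, where confusion[i][j] counts samples of
    true class i that were predicted as class j."""
    total = sum(sum(row) for row in confusion)
    tp = confusion[k][k]
    fn = sum(confusion[k]) - tp                  # class k predicted as another class
    fp = sum(row[k] for row in confusion) - tp   # other classes predicted as k
    tn = total - tp - fn - fp
    sensitivity = tp / (tp + fn)
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / total
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {"sensitivity": sensitivity, "precision": precision,
            "accuracy": accuracy, "f1": f1}


# Toy 3-class confusion matrix (made-up counts, not results from this study).
toy_confusion = [[8, 2, 0],
                 [1, 9, 0],
                 [0, 0, 10]]
metrics = per_class_metrics(toy_confusion, 0)
```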

First, classification was performed with the pretrained Alexnet and Googlenet architectures. Table 2 shows the performance values obtained with Alexnet and Googlenet for the two data sets. In the classification with Alexnet on the Kaggle data set, 30 images from the Eosinophil class, 41 from the Neutrophil class and 2 from the Monocyte class, 73 images in total, were misclassified, giving an overall accuracy of 92.64%. Googlenet gave relatively better results, with an overall accuracy of 95.74%. With this architecture, 32 images from the Eosinophil class, 1 from the Lymphocyte class and 5 from the Monocyte class, 38 images in total, were misclassified. For the LISC data set, the overall accuracy with Googlenet is 96.47%: 14 images from the Eosinophil class, 2 from the Lymphocyte class and 11 from the Monocyte class were misclassified. In the classification with Alexnet, 13 images from the Eosinophil class, 3 from the Lymphocyte class and 1 from the Monocyte class were misclassified, and the overall accuracy is 96.72%. According to Table 2, Alexnet and Googlenet gave similar results on both data sets.

Table 2 Pretrained Alexnet and Googlenet performance parameters

According to Table 2, the overall accuracy and F1 score leave room for improvement. With this goal, the WBC images are classified with the Alexnet-Googlenet-SVM hybrid CNN model. The biggest factor in the accuracy of the model is the Concat layer, which combines the feature vectors of Alexnet's Pool5 layer and Googlenet's Avg_pool layer. To obtain the feature vector in the Concat layer, the maximum, minimum and average functions were each applied to the feature vectors of the Pool5 and Avg_pool layers. The resulting feature vectors are given as input to the SVM algorithm. SVM places the features of each image in the coordinate space and then performs classification by finding the hyperplane that best separates the classes. The SVM classification uses 10-fold cross-validation. The highest performance was achieved with the maximum function. Table 3 shows the classification performance values obtained with the Alexnet-Googlenet-SVM model for both data sets.

Table 3 Alexnet-Googlenet-SVM classification results

For the Kaggle data set, 2 images from the Eosinophil class and 1 from the Neutrophil class were misclassified. For the LISC data set, 3 images from the Eosinophil class, 3 from the Lymphocyte class and 1 from the Monocyte class were misclassified. According to Table 3, the overall accuracy is 99.7% and the F1 score is 0.99 for the Kaggle data set, and the overall accuracy is 98.23% and the F1 score is 0.98 for the LISC data set. Another measure used to assess classification performance is the area under the ROC curve, known as AUC. The larger the area under the curve, the more accurate and reliable the classification model. Figure 5 shows the ROC curves obtained with the Alexnet-Googlenet-SVM hybrid model for the Kaggle data set. For the Eosinophil, Lymphocyte, Monocyte and Neutrophil classes in the Kaggle data set, the AUC is 0.984, 0.999, 0.996 and 0.984, respectively.

Fig. 5 Eosinophil, lymphocyte, monocyte and neutrophil ROC curves for Kaggle data set

Figure 6 shows the ROC curves obtained with the Alexnet-Googlenet-SVM hybrid model for the LISC data set. For the Eosinophil, Monocyte, Lymphocyte and Neutrophil classes in the LISC data set, the AUC is 0.994, 0.974, 0.967 and 0.998, respectively.

Fig. 6 Eosinophil, lymphocyte, monocyte and neutrophil ROC curves for LISC data set
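To make the AUC values concrete, the sketch below computes the area under the ROC curve for one class against the rest. It uses the equivalent rank formulation: the fraction of positive/negative pairs in which the positive sample receives the higher score, with ties counted as one half. The function name `roc_auc`, the one-vs-rest setup and the toy labels and scores are assumptions for illustration, not data from this study.

```python
def roc_auc(labels, scores):
    """AUC for binary labels (1 = class of interest, 0 = all other classes)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0    # positive ranked above negative
            elif p == n:
                wins += 0.5    # ties count as half
    return wins / (len(positives) * len(negatives))


# Toy one-vs-rest example: label 1 marks the class of interest.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
auc = roc_auc(labels, scores)
```

An AUC of 1.0 means that every positive sample is scored above every negative one, while 0.5 corresponds to random ranking.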

The ROC curves drawn separately for each class and the performance values obtained show that the proposed method classifies successfully. To support this result, a comparison with the literature is presented in Table 4.

Table 4 Comparison of the proposed method with the literature

Considering the results obtained for the Kaggle data set, our hybrid CNN model achieves better values than other studies in most of the classification metrics for each WBC class. Togaçar et al. and Kutlu et al. obtained the highest classification accuracies using CNN models; the accuracy of these studies is 99%. Compared with these studies, the Alexnet-Googlenet-SVM model gave better results in terms of accuracy and F1 score. The accuracy for each class is over 99%. The overall classification performance of our hybrid CNN model, especially in terms of F1 score and AUC, is better than that of other CNN models, which supports the reliability of the proposed model. For the LISC data set, the accuracy is greater than 98% for all WBC classes. The performance values obtained for this data set are in line with the studies in the literature.

5 Conclusion

The WBC test provides information about the number of white blood cells in the blood. A white blood cell count outside the normal range is associated with various diseases. In this article, Eosinophil, Lymphocyte, Monocyte and Neutrophil WBC images in two different data sets are classified by the Alexnet-Googlenet-SVM hybrid CNN method. There are two important reasons behind the success of the proposed method. The first is the joint evaluation of the feature vectors from Alexnet's Pool5 layer and Googlenet's Avg_pool layer: the feature with the maximum value from the two feature vectors is used for classification. The second is the use of SVM, a powerful classification algorithm. Considering both the literature and the results obtained, we can claim that our hybrid CNN model can be used in medical diagnostic systems.