1 Introduction

Cardiovascular disease (CVD) is currently the leading cause of death worldwide [14]. CVDs are disorders of the heart and blood vessels that include coronary heart disease, cerebrovascular disease, heart failure, and other pathologies [29]. Heart disease is mainly caused by the failure of the heart to pump enough blood to the body [1]. The most important risk factors for heart disease are an unhealthy diet, advanced age, smoking, high blood pressure, alcohol consumption, and physical inactivity. According to a World Health Organization (WHO) report [44], heart diseases kill approximately 17.9 million people each year. In the same context, the American Heart Association reported that nearly half of American adults, or approximately 121.5 million adults, are affected by CVDs [7]. Thus, it is crucial to detect cardiac risk factors early to treat cardiac patients effectively before a heart attack or stroke occurs [26]. To the best of our knowledge, few works address CVD prediction [13, 19]. In addition, most existing approaches optimize only one objective, such as model accuracy [27]. However, multiple conflicting objectives (such as sensitivity, specificity, precision, and F1 score) are expected to be optimized simultaneously. Consequently, automated CVD prediction remains one of the most important and challenging tasks globally.

Diagnostic testing and wearable monitoring are currently the two most common methods for detecting CVDs. However, extracting useful risk factors for heart disease from electronic diagnostic tests is extremely difficult because computerized medical records are unstructured and continuously grow in size [27, 32]. One obvious solution is to use a smart system that instantly combines the data from wearable monitoring and medical records, evaluates the gathered data, diagnoses any concealed heart attack warning symptoms, and predicts cardiac failure. In smart CAD diagnostic systems, Machine Learning (ML) models play a vital role because they can efficiently classify heart patients as normal or abnormal and predict the output from existing data over relatively short periods [8, 16, 36, 48]. Recently, many ML approaches have been developed to detect the disease from training datasets containing both inputs and outputs [2, 20, 47]. Examples include Naïve Bayes (NB), Random Forest (RF) [21], Logistic Regression (LR), K-Nearest Neighbor (k-NN) [17], Support Vector Machine (SVM) [18], and Artificial Neural Networks (ANN) [10, 11]. These state-of-the-art models have been widely commercialized and frequently enhanced by professionals in academia and industry. Although these approaches may solve the problem of heart disease detection, they are prone to becoming trapped in local optima [9]. This is a challenge for ML models in the medical domain, where it is critical to detect heart disease with very high accuracy.

Among ML models, SVM has shown effective performance in various classification and prediction problems in many fields. Current real-life applications of SVM include fault detection [45], image classification [25], and biomedical applications [5]. The use of the optimal hyperplane that separates cases of different class labels is responsible for SVM’s powerful learning capability. However, SVM is sensitive to its hyperparameters, which have a direct effect on efficiency and accuracy [38]. Therefore, owing to this strong dependence on specific parameters, SVM has been combined with various optimization algorithms, such as modified binary particle swarm optimization (MBPSO) [40], genetic SVM and analysis of variance (GSVMA) [15], gray wolf optimization (GWO) [4], genetic algorithm (GA), improved bacterial foraging optimization-based twin support vector machine (IBFO-TSVM) [30], a novel cuckoo search approach called CS-PSO-SVM [23], grasshopper optimization algorithm (GOA) [3], and ant colony optimization [34]. These hybrid ML methods have produced results that outperform conventional models.

Many optimization techniques are available in the field of swarm intelligence algorithms. Among them is the Particle Swarm Optimization (PSO) algorithm, a population-based stochastic optimization algorithm with few parameters [12]; hence, it is simple to implement [41]. PSO is an important component in ML models because it is used to adjust SVM parameters [6]. Furthermore, PSO has been used to adjust the weights of the back-propagation neural network, where it produced better results than the conventional back-propagation method [37]. It has also been used to select the best feature subsets. Swarm intelligence algorithms typically locate good solutions when they exist; however, because PSO is stochastic, it is not guaranteed to find the best solution for every problem. Since 1995, many variants have been developed to improve the performance of PSO. Liu and Fu [23] used chaos theory to modify the PSO inertia weight parameter. The proposed chaotic PSO balances the exploration and exploitation phases effectively. Recently, Sun et al. [35] introduced quantum theory into PSO and developed a quantum-behaved PSO (QPSO) algorithm. QPSO achieved better results than traditional PSO, moving effectively toward the best solutions in the search space. Unlike the PSO method, the QPSO method does not require velocity vectors to move the particles and has fewer adjustable parameters overall [37]; hence, the QPSO algorithm is easier to construct than the PSO algorithm. Therefore, in this study, QPSO is utilized to tune the SVM parameters to accurately predict heart disease.

This study mainly aimed to develop a novel QPSO-SVM approach for detecting and predicting heart disease. This approach retains the properties of SVM (simplicity, fast classification in healthcare applications, and efficiency) while avoiding its convergence to a local minimum by training the SVM with QPSO, a variant of PSO, which is inspired by flocks of birds searching for food. To the best of our knowledge, this study developed the first method for detecting heart disease using a hybrid of quantum PSO and an SVM learning algorithm. The QPSO algorithm is well suited to heart disease identification because it works well on high-dimensional problems and has balanced exploitation and exploration capabilities. The first step in the QPSO-SVM model is data preparation, which includes cleaning all datasets required to train the model and extracting information that can be used for decision-making. QPSO computes the fitness value after the parameter values have been adjusted. The entire swarm then evolves to generate new solutions. Subsequently, an adaptive threshold is used at the end of each generation to maintain a balance between exploitation and exploration. Finally, to evaluate the performance of the proposed QPSO-SVM, the Cleveland heart disease datasets were used to build the model and compare it with some state-of-the-art methods. The QPSO-SVM outperforms other techniques in terms of classification accuracy. Furthermore, the experiments are repeated for different values of the re-adapt control parameters to show the classifiers’ sensitivity to the parameter values.

The contributions of the work are summarized as follows:

  1. A novel QPSO-SVM model is proposed that integrates QPSO and SVM: the SVM is trained by a quantum-behaved PSO (QPSO) algorithm that selects the optimal values of the SVM parameters to improve classification accuracy.

  2. The QPSO-SVM was trained on public heart disease datasets to forecast patients’ heart disease status based on their current state.

  3. The proposed QPSO-SVM is evaluated and compared with the results of previous studies using evaluation metrics such as accuracy, specificity, precision, G-Mean, and F1 score. Additionally, a statistical analysis is presented to evaluate the significance of QPSO-SVM compared with other models.

The remainder of this study is structured as follows: Section 2 summarizes the literature review. Section 3 describes two algorithms, SVM and PSO. Section 4 provides a detailed explanation of the proposed QPSO-SVM model for heart disease prediction. Section 5 presents the experimental results. Section 6 contains the conclusions and recommendations for future work.

2 Literature review

This section reviews the ML models used to predict heart disease. Recently, Joloudari et al. [18] suggested a new and highly efficient CAD diagnosis method called GSVMA to support the effective diagnosis and prediction of CAD by selecting key features. The GSVMA approach has two key blocks. The first block is genetic optimization, which selects the most important features. The second is the SVM algorithm with an ANOVA kernel, which is used to categorize the input dataset. In comparison to the previously described techniques, the proposed GSVMA method achieved the best performance using 31 of 55 features in terms of accuracy (89.45%), F-measure (80.49%), specificity (100%), sensitivity (81.22%), and area under the curve (AUC) (100%). In the same context, Perumal [28] investigated the effect of using three ML classifiers, namely, k-NN, SVM, and LR, based on the principal component analysis method. Their approach achieved 85%, 87%, and 69% accuracy rates for SVM, LR, and k-NN, respectively. SVM and LR thus provide similar accuracy, both higher than that of k-NN. Furthermore, Liu et al. [24] proposed a new model for heart disease diagnosis based on the ReliefF and Rough set (RFRS) methods. They provided two subsystems: the RFRS feature selection system and the classification system. The first system is divided into three phases: data normalization, feature extraction using the ReliefF method, and feature reduction using RFRS. The second system uses an ensemble classifier based on the C4.5 classifier. The experiments in this study were performed on the University of California Irvine (UCI) database based on the classifier C4.5. Based on a jackknife cross-validation method, it achieved a maximum classification accuracy of 92.59%. Similarly, Obasi and Shafiq [8] proposed a new ML model based on existing techniques such as LR, RF, and NB classifiers. The proposed system, based on the medical records of patients with 1990 observations and 18 features, achieved the highest accuracy of 92.44%, 61.96%, and 59.7% for the LR, RF, and NB classifiers, respectively. In addition, Latha and Jeeva [21] improved the accuracy of heart disease risk prediction based on various classifiers. Ensemble methods were used, including bagging, boosting, voting, and stacking. When bagging and boosting were used, the accuracy increased by a maximum of 6.92% and 5.94%, respectively. While C4.5, PART, and multilayer perceptrons generate an accuracy of less than 80%, the NB classifier achieves a high accuracy of 83.17%.

Aljarah et al. [3] recently implemented a hybrid GOA and SVM to maximize SVM classification accuracy. The hybrid GOA-SVM was tested on 18 datasets. The experimental results were compared with GA, PSO, GWO, CS, firefly algorithm (FF), bat algorithm, and multi-verse optimizer. Although considerable effort has been devoted to GOA-SVM, it still suffers from the drawback of becoming trapped in local optima. Furthermore, Vieira et al. [40] proposed MBPSO, a modified binary PSO method for detecting patients with septic shock. The experimental results show that the MBPSO achieves high accuracy compared to other PSO-based algorithms. Similarly, Wei-jia et al. [43] proposed a new detection algorithm for heart diseases based on a hybrid PSO-SVM algorithm wherein PSO automatically reduces the number of features to improve SVM classifier accuracy. Experimental results were compared with other algorithms such as an ANN and feature selection-based SVM (FS-SVM). The generated rule showed that men are more likely to develop coronary heart disease than women. Subsequently, Al-Tashi et al. [4] proposed the GWO-SVM classification model for predicting heart disease. The proposed method is a combination of the feature selection method by GWO and classification by SVM. In GA-SVM, the GA is used to select the more important features, and experimental results showed that the GA-SVM model outperformed current models in terms of accuracy. When classifying heart disease using all features, the SVM achieved 83.70% accuracy. However, when classifying heart disease using the selected features, the SVM classifier achieves an accuracy of 88.34%. Liu and Fu [23] introduced a novel approach called CS-PSO-SVM for disease diagnosis based on the hybridization of cuckoo search (CS), PSO, and SVM. CS is used as a search algorithm for finding the best initial parameter of the kernel function in SVM. Thereafter, PSO is used in the SVM training phase to determine the best SVM parameters. The experimental results showed that CS-PSO-SVM outperforms PSO-SVM and GA-SVM in terms of classification accuracy and F-measure. However, the classification accuracy of CS-PSO-SVM still needs to be improved. Subanya and Rajalaxmi [33] developed ABC-SVM, an Artificial Bee Colony (ABC) algorithm based on swarm intelligence to identify the optimal features for heart disease detection. The SVM method is used to evaluate ABC’s fitness. The Cleveland heart disease dataset from the UCI ML repository is used to validate the performance of the proposed algorithm. The experimental results indicate that the ABC-SVM strategy can outperform current models in terms of classification accuracy. ABC-SVM generated an accuracy of 85.29% in the first test with five features. With seven features and the same dataset in the second experiment, it obtained an accuracy of 86.76%.

Recently, Wang et al. [42] proposed the cloud-random forest (C-RF) model, which combines the cloud model and random forest to estimate the risk of coronary heart disease. The proposed model is based on the conventional classification and regression trees (CART). The C-RF model was compared with CART, SVM, convolutional neural network (CNN), and RF using standard performance measures including accuracy, error rates, ROC curve, and AUC value. The classification accuracy of the C-RF model is 85%, an improvement of 8%, 9%, 4%, and 3% over CART, SVM, CNN, and RF, respectively. As a result, the C-RF model performs better in terms of classification effect and performance when assessing the risk of coronary heart disease. Lin et al. [22] also used PSO to select features and determine parameter values. The SVM is then used to evaluate the classification using the selected subset. A comparison with other methods indicated that the proposed method, PSO-SVM with feature selection, performs better than PSO-SVM without feature selection. However, PSO-SVM still needs to be combined with other classification methods to increase its performance. Reddy et al. [31] employed several machine learning models for efficient heart disease risk prediction, such as NB, Sequential Minimal Optimization (SMO) [28], instance-based classifier (IBk), AdaBoostM1 with decision stump (DS) [24], AdaBoostM1 with LR, bagging with REPTree, bagging with LR [43], and RF. Based on the results of the experiment, SMO achieved an accuracy of 85.148% using the entire set of features. Additionally, when using the chi-squared attribute evaluation method, it produced the highest accuracy of 86.468%.

Although many studies have been conducted to predict the risk of heart disease [46], they have some drawbacks such as being trapped in local optima and the need to improve the detection rate. Therefore, in this study, a novel approach, called QPSO-SVM, was proposed to effectively detect the risk of heart disease. The QPSO algorithm was used because it is simple to develop and apply, has few parameters, and produces high classification accuracy. The proposed model was applied to the Cleveland heart disease dataset that is available from the University of California, Irvine repository with two groups. Each dataset group has 303 patients and 13 attributes.

3 Preliminaries

In this section, SVM and PSO are briefly explained.

3.1 SVM

The SVM is a well-known ML technique developed by Vapnik [39]. SVM maximizes the margin between the positive and negative data points closest to the decision hyperplane in an N-dimensional space when there are several classes to distinguish. The effectiveness of SVM is significantly reduced when the data are not linearly separable. However, this challenge can be overcome by mapping the data from the input space to a new, higher-dimensional space using one of the kernel functions. The objective of this kernel function is to make it possible to identify the optimal decision plane.

SVM is a popular ML model that is widely used in heart disease risk prediction [3,4,5, 15, 23, 25, 30, 34, 38, 40, 45]. The diagnosis of heart disease is treated as an SVM classification problem that assigns the feature vector of a patient, \(\overrightarrow{x}=\left[x_{1},x_{2},\dots ,x_{n}\right]\), to a class \(y_{j}\in Y=\{y_{1},y_{2},\dots ,y_{\left|Y\right|}\}\), where \(Y\) is a set of classes. Assume that there are \(N\) training samples \(\left\{\left(\overrightarrow{x}_{1},y_{1}\right),\left(\overrightarrow{x}_{2},y_{2}\right),\dots ,\left(\overrightarrow{x}_{N},y_{N}\right)\right\}\), with \(\overrightarrow{x}_{i}\in \mathbb{R}^{d}\) and \(y_{i}\in \{\pm 1\}\), \(1\le i\le N\), where \(y_{1},y_{2},\dots ,y_{N}\) indicate the class labels for the feature vectors \(\overrightarrow{x}_{1},\overrightarrow{x}_{2},\dots ,\overrightarrow{x}_{N}\), respectively. For linearly separable data, the hyperplane \(\overrightarrow{\omega }^{T}\cdot \overrightarrow{x}_{i}+b=0\) represents the decision boundary between the positive and negative classes, where \(\overrightarrow{\omega }\) is a weight vector, \(b\) is the bias, and \(\overrightarrow{x}_{i}\) is the input data. The goal of the SVM is to find the best values of \(\overrightarrow{\omega }\) and \(b\) that construct the planes \(H_{1}\) and \(H_{2}\), where \(H_{1}: \overrightarrow{\omega }^{T}\cdot \overrightarrow{x}_{i}+b\ge +1\) for the positive class and \(H_{2}: \overrightarrow{\omega }^{T}\cdot \overrightarrow{x}_{i}+b\le -1\) for the negative class, as shown in Fig. 1. Generally, SVM maximizes the margin between the positive and negative data points closest to the decision hyperplane. Here, the planes \(H_{1}\) and \(H_{2}\) can be combined as \(y_{i}\left(\overrightarrow{\omega }^{T}\cdot \overrightarrow{x}_{i}+b\right)-1\ge 0\ \forall i=1,2,\dots ,N\), where \(y_{i}\in \{\pm 1\}\). In the standard SVM, finding these hyperplanes is formulated as the optimization problem in Eq. (1), which maximizes the margin between the negative and positive classes.

Fig. 1 SVM classifier

$$\begin{array}{c}\text{Minimize}\ \left\{\frac{1}{2}\,\overrightarrow{\omega }^{T}\cdot \overrightarrow{\omega }\right\}\\ \text{s.t.}\ y_{i}\left(\overrightarrow{\omega }^{T}\cdot \overrightarrow{x}_{i}+b\right)-1\ge 0\quad \forall i=1,2,\dots ,N\end{array}$$
(1)
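
As a brief aside, the link between the objective in Eq. (1) and the margin can be made explicit. The following is the standard textbook argument for linear SVMs, given here as a sketch rather than as part of the original derivation; it uses the distance between the two parallel planes \(H_{1}\) and \(H_{2}\) taken at equality:

$$\begin{aligned}H_{1}:\ \overrightarrow{\omega }^{T}\cdot \overrightarrow{x}+b&=+1,\qquad H_{2}:\ \overrightarrow{\omega }^{T}\cdot \overrightarrow{x}+b=-1,\\ \text{margin}&=\frac{\left|(+1)-(-1)\right|}{\lVert \overrightarrow{\omega }\rVert }=\frac{2}{\lVert \overrightarrow{\omega }\rVert },\\ \max_{\overrightarrow{\omega },\,b}\ \frac{2}{\lVert \overrightarrow{\omega }\rVert }\ &\Longleftrightarrow\ \min_{\overrightarrow{\omega },\,b}\ \frac{1}{2}\lVert \overrightarrow{\omega }\rVert ^{2}=\min_{\overrightarrow{\omega },\,b}\ \frac{1}{2}\,\overrightarrow{\omega }^{T}\cdot \overrightarrow{\omega }.\end{aligned}$$

Hence, minimizing the quadratic objective subject to the constraints of Eq. (1) yields the separating hyperplane with the widest margin.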

In the case of nonlinearly separable data, the standard SVM cannot classify new cases into the correct class. The SVM therefore introduces a kernel function (ψ) that maps the training data into a higher-dimensional space to avoid misclassification. The formulation of the objective function of SVM is given by Eq. (2):

$$\begin{array}{c}\text{Minimize}\ \left\{\frac{1}{2}\,\overrightarrow{\omega }^{T}\cdot \overrightarrow{\omega }+C\sum_{i=1}^{N}\epsilon _{i}\right\}\\ \text{s.t.}\ y_{i}\left(\overrightarrow{\omega }^{T}\cdot \psi (\overrightarrow{x}_{i})+b\right)-1+\epsilon _{i}\ge 0\quad \forall i=1,2,\dots ,N\end{array}$$
(2)

Here, \(\epsilon _{i}\) represents a slack variable, and \(C\) is a penalty parameter that controls the trade-off between the slack variables \(\epsilon _{i}\) and the margin size.
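
To make the role of \(C\) and the slack variables concrete, the following minimal sketch evaluates the objective of Eq. (2) for a fixed weight vector. It is an illustration only: it assumes an identity feature map (ψ(x) = x), takes the smallest feasible non-negative slack for each sample, and uses a hypothetical function name and toy data that do not come from this study.

```python
def soft_margin_objective(w, b, X, y, C):
    """Return 0.5 * w.w + C * sum(slack) and the per-sample slacks.

    Assumption: identity feature map, i.e. psi(x) = x.
    """
    slacks = []
    for x_i, y_i in zip(X, y):
        score = sum(w_j * x_j for w_j, x_j in zip(w, x_i)) + b
        # Smallest non-negative slack satisfying y_i * score - 1 + slack >= 0.
        slacks.append(max(0.0, 1.0 - y_i * score))
    margin_term = 0.5 * sum(w_j * w_j for w_j in w)
    return margin_term + C * sum(slacks), slacks


# Toy data: the third sample lies on the wrong side of the boundary.
X = [[2.0, 0.0], [0.0, 0.0], [0.5, 0.0]]
y = [1, -1, 1]
objective, slacks = soft_margin_objective([1.0, 0.0], -1.0, X, y, C=10.0)
print(objective, slacks)
```

A larger \(C\) makes each unit of slack more expensive, so the optimizer prefers a narrower margin over misclassified training samples; a smaller \(C\) does the opposite.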

In a nonlinear SVM classifier, the feature vector \({\overrightarrow{\mathrm{x}}}_{\mathrm{i}}\) is labeled as \({\mathrm{i}}^{*}\) if the objective function \({\mathrm{f}}_{\mathrm{i}}\) generates the highest value for \({\mathrm{i}}^{*}\) as follows:

$$i^{*}=\arg\max\ f_{i}\left(\overrightarrow{x}_{i}\right)=\arg\max\ f_{i}\left(\overrightarrow{\omega }^{T}\cdot \psi (\overrightarrow{x}_{i})+b_{i}\right)\quad \forall i=1,2,\dots ,N$$
(3)

The result of the \(i^{*}\)th objective function may be positive or negative, as given in Eq. (4):

$$\begin{array}{cc}f_{i=i^{*}}\left(\overrightarrow{x}_{i}\right)>0,& f_{i\ne i^{*}}\left(\overrightarrow{x}_{i}\right)<0.\end{array}$$
(4)

During classification, a feature vector \(\overrightarrow{x}_{i}\) that does not satisfy Eq. (4) is left unclassified and is defined as an ambiguous case as follows:

$$\forall\ \overline{\overrightarrow{x}_{i}}\notin \left\{\overrightarrow{x}_{i}\ \middle|\ f_{i=i^{*}}\left(\overrightarrow{x}_{i}\right)>0,\ f_{i\ne i^{*}}\left(\overrightarrow{x}_{i}\right)<0\right\}$$
(5)

Here, the ambiguous vectors are classified using the naïve Bayes method. According to naïve Bayes, the probability that the ambiguous vector \(\overrightarrow{x}_{i}\) belongs to a class \(C_{j}\) is defined using Eq. (6):

$$P\left(C_{j}\mid \overrightarrow{x}_{i}\right)=\frac{P\left(C_{j}\right)P\left(\overrightarrow{x}_{i}\mid C_{j}\right)}{P\left(\overrightarrow{x}_{i}\right)},\quad \forall i=1,2,\dots ,N$$
(6)

The ambiguous vector \({\overrightarrow{\mathrm{x}}}_{\mathrm{i}}\) is labeled as \({i}^{*}\), if the conditional probability \(P({C}_{j} | { \overrightarrow{\mathrm{x}}}_{\mathrm{i}})\) is the highest for \({i}^{*}\), as given in Eq. (7):

$$i^{*}=\arg\max\ \left(P\left(C_{j}\mid \overrightarrow{x}_{i}\right)\right)\quad \forall i=1,2,\dots ,N$$
(7)
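
The decision rule of Eqs. (3) to (7) can be sketched as follows. This is a hedged illustration, not the implementation used in this study: the function name is hypothetical, and both the per-class SVM decision values and the naïve Bayes posteriors of Eq. (6) are assumed to have been computed elsewhere and passed in as plain lists.

```python
def classify_with_fallback(scores, posteriors):
    """Pick a class from SVM decision values, falling back to naive Bayes.

    scores[j]     : assumed decision value f_j(x) of class j
    posteriors[j] : assumed posterior P(C_j | x) from Eq. (6)
    """
    best = max(range(len(scores)), key=lambda j: scores[j])  # Eq. (3)
    # Eq. (4): the winning score is positive and every other score is negative.
    unambiguous = scores[best] > 0 and all(
        s < 0 for j, s in enumerate(scores) if j != best
    )
    if unambiguous:
        return best, "svm"
    # Ambiguous case (Eq. 5): choose the class with the highest posterior (Eq. 7).
    fallback = max(range(len(posteriors)), key=lambda j: posteriors[j])
    return fallback, "naive_bayes"


print(classify_with_fallback([0.8, -0.3], [0.4, 0.6]))  # clear SVM decision
print(classify_with_fallback([0.2, 0.1], [0.4, 0.6]))   # two positive scores
```

In the second call both decision values are positive, so Eq. (4) is violated and the label is taken from the posteriors instead.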

3.2 PSO algorithm

Kennedy and Eberhart [12] proposed the PSO algorithm. A swarm in PSO is made up of a fixed number of particles. Each particle represents a single candidate solution to the optimization problem in d-dimensional space. Specifically, two types of information determine how the ith particle moves in this space. The first is the historically best position of the ith particle, denoted by \(\mathrm{Pbest}^{(t)}(i)\). The second is the best among all \(\mathrm{Pbest}^{(t)}(i)\) in the whole swarm, that is, the global best position, denoted by \(\mathrm{gbest}^{(t)}\), which has the highest fitness value. The velocity and position of each particle are updated as follows:

$$\begin{array}{c}V^{(t+1)}(i)=\omega \cdot V^{(t)}(i)+C_{1}\cdot r_{1}^{(t)}(i)\cdot \left(\mathrm{Pbest}^{(t)}(i)-X^{(t)}(i)\right)+C_{2}\cdot r_{2}^{(t)}(i)\cdot \left(\mathrm{gbest}^{(t)}-X^{(t)}(i)\right)\\ X^{(t+1)}(i)=V^{(t+1)}(i)+X^{(t)}(i)\end{array}$$
(8)

where \(V(i)\) and \(X(i)\) represent the velocity and position of the ith particle in the D-dimensional space, respectively, with \(\left|V(i)\right|\le V_{max}\); \(\omega\) is the inertia weight; \(r_{1}\) and \(r_{2}\) are random numbers uniformly distributed in the range [0, 1]; and \(C_{1}\) and \(C_{2}\) are the cognitive and social learning factors, respectively.
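
A minimal sketch of one PSO iteration following Eq. (8) is given below. The function name and the default values of the inertia weight, learning factors, and \(V_{max}\) are illustrative assumptions rather than settings reported in this study, and fresh random numbers are drawn per dimension, which is one common convention.

```python
import random


def pso_step(X, V, pbest, gbest, w=0.7, c1=1.5, c2=1.5, v_max=1.0, rng=random):
    """Update velocities and positions of all particles in place (Eq. 8).

    Assumed defaults: w, c1, c2 and v_max are illustrative values only.
    """
    for i in range(len(X)):
        for d in range(len(X[i])):
            r1, r2 = rng.random(), rng.random()
            v = (w * V[i][d]
                 + c1 * r1 * (pbest[i][d] - X[i][d])
                 + c2 * r2 * (gbest[d] - X[i][d]))
            V[i][d] = max(-v_max, min(v_max, v))  # enforce |V| <= V_max
            X[i][d] += V[i][d]                    # position update
    return X, V


# One particle at (2, -2) with the global best at the origin.
rng = random.Random(0)
X, V = [[2.0, -2.0]], [[0.0, 0.0]]
X, V = pso_step(X, V, pbest=[[2.0, -2.0]], gbest=[0.0, 0.0], rng=rng)
print(X, V)
```

Because the particle starts at its own best position with zero velocity, only the social term is active, and the clamped step moves it at most \(V_{max}\) toward the global best in each dimension.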

After evaluating the initial population, each particle in the swarm calculates the weighted average position \(S^{(t)}(i)=\left[S^{(t)}(i,1),S^{(t)}(i,2),\dots ,S^{(t)}(i,D)\right]\) of its own historically best position \(\mathrm{Pbest}^{(t)}(i)\) and the global best position \(\mathrm{gbest}^{(t)}\) as the attraction point and gradually moves toward this point. The formula for calculating the weighted average position \(S^{(t)}(i,d)\) is as follows:

$$S^{(t)}(i,d)=\frac{C_{1}\cdot r_{1}^{(t)}(i)\cdot \mathrm{Pbest}^{(t)}(i,d)+C_{2}\cdot r_{2}^{(t)}(i)\cdot \mathrm{gbest}^{(t)}(d)}{C_{1}\cdot r_{1}^{(t)}(i)+C_{2}\cdot r_{2}^{(t)}(i)},\quad 1\le i\le N,\ 1\le d\le D$$
(9)
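
The attraction point can be illustrated with the short sketch below. It assumes that the two weights are \(C_{1}\cdot r_{1}\) and \(C_{2}\cdot r_{2}\), mirroring the cognitive and social terms of Eq. (8), and that one pair of random numbers is drawn per particle; the function name and the default learning factors are hypothetical.

```python
import random


def attractor(pbest_i, gbest, c1=2.0, c2=2.0, rng=random):
    """Weighted average of a particle's personal best and the global best.

    Assumption: weights are c1*r1 and c2*r2, drawn once per particle.
    """
    a = c1 * rng.random()
    b = c2 * rng.random()
    if a + b == 0.0:  # degenerate draw: fall back to the midpoint
        a = b = 1.0
    return [(a * p + b * g) / (a + b) for p, g in zip(pbest_i, gbest)]


S = attractor([0.0, 10.0], [4.0, 6.0], rng=random.Random(3))
print(S)
```

Because the weights are non-negative and normalized, each coordinate of the attraction point lies between the corresponding coordinates of the personal best and the global best.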

When the value of the \(V_{max}\) parameter is poorly chosen, the particles cannot transition properly from breadth (exploration) to depth (exploitation) search, so the particle trajectories frequently fall into local optima. Sun, Fang, Wu, and Palade [35] proposed a quantum-behaved PSO approach to improve the performance of the PSO algorithm, in which the state of a particle is described by the wave function ψ(x, t) of the Schrödinger equation [35] rather than by the position and velocity used in the original PSO algorithm.

4 Proposed model

The model proposed in this section is based on the quantum-behaved particle swarm optimization (QPSO) algorithm and SVM to analyze and diagnose heart disease risk using a real dataset. The proposed QPSO-SVM classification model consists of three phases. First, the data are processed by converting nominal data into numerical data and applying effective scaling techniques. Second, QPSO automatically adjusts the SVM parameters. Finally, the improved QPSO-SVM performs the classification tasks. Figure 2 shows the flow diagram of the proposed methodology for heart disease detection. Here, a simple and efficient hybrid model is suggested to improve optimization capabilities without increasing the computational complexity.

Fig. 2 The flow diagram of the proposed QPSO-SVM classification model for heart disease detection

4.1 Data preprocessing

Data preprocessing is the most important step before implementing the proposed model because real-world data cannot be used directly in the prediction task: they are often disorganized, incomplete, and inconsistent. First, the categories in the training set are converted to a numerical representation, which is then applied equally to the test set. The categorical columns have values ranging from 0 to n − 1, where n is the number of categories. Next, the missing values are replaced with random uniform noise ranging from 0 to 0.01, and the data are then scaled to the range [0, 1] using the min–max method.
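
The three preprocessing steps can be sketched as follows. This is a minimal illustration under stated assumptions: missing values are represented by `None`, the helper names and the toy column values are hypothetical, and categories that appear only in the test set are not handled.

```python
import random


def fit_categories(column):
    """Map each category seen in the training column to a code 0..n-1."""
    return {cat: code for code, cat in enumerate(sorted(set(column)))}


def impute_noise(column, rng, high=0.01):
    """Replace missing values (None) with uniform noise in [0, high]."""
    return [rng.uniform(0.0, high) if v is None else float(v) for v in column]


def fit_min_max(column):
    """Return the minimum and maximum of the training column."""
    return min(column), max(column)


def apply_min_max(column, lo, hi):
    """Scale values with the training minimum and maximum."""
    span = hi - lo
    return [0.0 if span == 0 else (v - lo) / span for v in column]


rng = random.Random(42)
# Categorical column: codes are fitted on the training set, reused on the test set.
codes = fit_categories(["typical", "atypical", "typical", "asymptomatic"])
test_codes = [codes[c] for c in ["atypical", "typical"]]
# Numerical column: impute first, then scale.
train_col = impute_noise([233, None, 250, 204], rng)
lo, hi = fit_min_max(train_col)
scaled = apply_min_max(train_col, lo, hi)
print(codes, test_codes, scaled)
```

Fitting the category codes and the min–max bounds on the training set only, and then reusing them on the test set, keeps the two sets on the same scale without leaking test information into training.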

4.2 Quantum-behaved particle swarm optimization algorithm

The quantum PSO (QPSO) algorithm assumes that the dynamic behavior of the particles in the PSO system meets the fundamental premise of quantum mechanics [35]. QPSO trains SVM by adjusting its parameters to avoid falling into local optima. QPSO uses the Monte Carlo method to calculate the position of the particle. The update formulas of QPSO are defined as:

$$S^{(t)}(i,d)=\psi \cdot \mathrm{Pbest}^{(t)}(i,d)+\left(1-\psi \right)\cdot \mathrm{gbest}^{(t)}(d)$$
(10)
$$mbest(d)=\frac{1}{N}\sum\nolimits_{i=1}^{N}\mathrm{Pbest}(i,d)$$
(11)
$$X^{(t+1)}(i,d)=S^{(t)}(i,d)\pm \theta \left|mbest(d)-X^{(t)}(i,d)\right|\cdot \ln\left(\frac{1}{u^{(t)}(i,d)}\right)$$
(12)

where \(S^{(t)}(i,d)\) is a random position between \(pbest\) and \(gbest\), \(u^{(t)}(i,d)\) and \(\psi\) are random numbers uniformly distributed in the range [0, 1], and \(\theta\) is an expansion coefficient.
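
The update of Eqs. (10) to (12) can be sketched as a single QPSO step. The sketch is illustrative only: the function name and the default value of θ are assumptions, the sign of the last term is chosen with equal probability (a common convention that the equations above do not fix), and \(u\) is drawn from (0, 1] so that the logarithm stays finite.

```python
import math
import random


def qpso_step(X, pbest, gbest, theta=0.75, rng=random):
    """Return new positions and mbest for one QPSO iteration.

    Assumptions: theta=0.75 and the 50/50 sign choice are illustrative.
    """
    n, dims = len(X), len(X[0])
    # Eq. (11): mean of the personal best positions in each dimension.
    mbest = [sum(pbest[i][d] for i in range(n)) / n for d in range(dims)]
    new_X = []
    for i in range(n):
        row = []
        for d in range(dims):
            psi = rng.random()
            u = 1.0 - rng.random()  # in (0, 1], avoids log(1/0)
            s = psi * pbest[i][d] + (1.0 - psi) * gbest[d]  # Eq. (10)
            step = theta * abs(mbest[d] - X[i][d]) * math.log(1.0 / u)
            sign = 1.0 if rng.random() < 0.5 else -1.0
            row.append(s + sign * step)  # Eq. (12)
        new_X.append(row)
    return new_X, mbest


# A fully converged swarm: every particle sits on the global best.
X0 = [[1.0, 2.0], [1.0, 2.0]]
new_X, mbest = qpso_step(X0, pbest=X0, gbest=[1.0, 2.0], rng=random.Random(1))
print(new_X, mbest)
```

In this converged example the distance \(\left|mbest(d)-X^{(t)}(i,d)\right|\) is zero, so the particles stay where they are; this shows that the search step shrinks as the swarm collapses, and no velocity vector is needed at any point.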

Here, Eqs. (10) and (12) are combined to obtain the generalized form of the standard particle position formula as follows:

$$X^{(t+1)}(i,d)=\psi \cdot \left(\mathrm{Pbest}^{(t)}(i,d)-\mathrm{gbest}^{(t)}(d)\right)+\mathrm{gbest}^{(t)}(d)\pm \theta \left|mbest(d)-X^{(t)}(i,d)\right|\cdot \ln\left(\frac{1}{u^{(t)}(i,d)}\right)$$
(13)

For each ith particle, two other particles, k and m, are selected at random from the swarm such that \(i\ne k\ne m\). The difference in location between particles k and m is then calculated as follows:

$${\varvec{\tau}}=\boldsymbol{ }{{\varvec{x}}}_{{\varvec{m}}}-{{\varvec{x}}}_{{\varvec{k}}}$$
(14)

Substituting \(\tau\) from Eq. (14) for \(\left({\mathrm{Pbest}}^{\left(t\right)} \left(i,d\right)-{\mathrm{ gbest}(\mathrm{d})}^{\left(t\right)}\right)\) from Eq. (10) and, to increase the randomness, applying the random factor \(\left(1-\psi \right)\) to the second term \({\mathrm{gbest}(\mathrm{d})}^{\left(t\right)}\) gives the revised evolution equation:

$${{\varvec{X}}}^{\left({\varvec{t}}+1\right)}\left({\varvec{i}},\boldsymbol{ }{\varvec{d}}\right)=\boldsymbol{ }{\varvec{\psi}}\boldsymbol{ }{\cdot}\boldsymbol{ }{{\varvec{\tau}}}_{{\varvec{d}}}+{\left(1-{\varvec{\psi}}\right)\boldsymbol{ }{\cdot}\mathbf{g}\mathbf{b}\mathbf{e}\mathbf{s}\mathbf{t}(\mathbf{d})}^{\left({\varvec{t}}\right)}\pm{\varvec{\theta}}\left|{\varvec{m}}{\varvec{b}}{\varvec{e}}{\varvec{s}}{\varvec{t}}\left({\varvec{d}}\right)-\boldsymbol{ }{{\varvec{X}}}^{\left({\varvec{t}}\right)}\left({\varvec{i}},\boldsymbol{ }{\varvec{d}}\right)\right|{\cdot}\mathbf{ln}\left(\frac{1}{{{\varvec{u}}}^{\left({\varvec{t}}\right)}({\varvec{i}},{\varvec{d}})}\right)$$
(15)

According to Eqs. (13) and (15), the position \({X}^{\left(t+1\right)}\left(i, d\right)\) is generated; thereafter, the individual \({X}^{\left(t+1\right)}\left(i, d\right)\) and the optimal position \({P}^{\left(t\right)}\left(i\right)\) are combined by crossover to calculate the test position T(t) (i) = [T(t) (i, 1), T(t) (i, 2),..., T(t) (i, d)] as follows:

$${\mathbf{T}}^{({\varvec{t}}+1)}({\varvec{i}},{\varvec{d}})=\left\{\begin{array}{cc}{\boldsymbol{ }{\varvec{X}}}^{\left({\varvec{t}}+1\right)}\left({\varvec{i}},\boldsymbol{ }{\varvec{d}}\right)& \;\;{\varvec{r}}{\varvec{a}}{\varvec{n}}{\varvec{d}}\left(0,1\right)<{\varvec{\rho}}\\ {\boldsymbol{ }{\varvec{P}}}^{\left({\varvec{t}}\right)}\left({\varvec{i}},\boldsymbol{ }{\varvec{d}}\right)& {\varvec{o}}{\varvec{t}}{\varvec{h}}{\varvec{e}}{\varvec{r}}{\varvec{w}}{\varvec{i}}{\varvec{s}}{\varvec{e}}\end{array}\right.$$
(16)

where \(\rho\) is the cross-over probability.

Using formula (13), the optimal position \({P}^{\left(t\right)}\left(i\right)\) of the ith particle is then updated:

$${\mathbf{P}}^{({\varvec{t}}+1)}({\varvec{i}},{\varvec{d}})=\left\{\begin{array}{cc}{\mathbf{T}}^{({\varvec{t}}+1)}({\varvec{i}},{\varvec{d}})& \kern2pc{\varvec{F}}({\mathbf{T}}^{\left({\varvec{t}}+1\right)}({\varvec{i}},{\varvec{d}}))<{\varvec{F}}({\mathbf{P}}^{\left({\varvec{t}}\right)}\left({\varvec{i}},{\varvec{d}}\right))\\ {\mathbf{P}}^{({\varvec{t}})}({\varvec{i}},{\varvec{d}})& {\varvec{o}}{\varvec{t}}{\varvec{h}}{\varvec{e}}{\varvec{r}}{\varvec{w}}{\varvec{i}}{\varvec{s}}{\varvec{e}}\end{array}\right.$$
(17)

Here, the fitness (adaptive value) function is denoted by \(F(\cdot )\). The value of \(\rho\) has a significant impact on the QPSO algorithm’s search capabilities and convergence speed. A lower \(\rho\) allows individuals in the swarm to retain more of their own information while maintaining swarm diversity, which benefits the algorithm’s global exploration. Conversely, a higher \(\rho\) encourages individuals to learn more from the experience of the entire swarm, which speeds up the algorithm’s convergence.
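The crossover and selection rules in Eqs. (16) and (17) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the `sphere` fitness, and the sample vectors are hypothetical, and the fitness F is assumed to be evaluated on the whole position vector rather than per dimension.

```python
import random


def crossover(x_new, p_old, rho, rng):
    """Eq. (16): per dimension, take X^(t+1) with probability rho, else keep P^(t)."""
    return [x if rng.random() < rho else p for x, p in zip(x_new, p_old)]


def greedy_update(trial, p_old, fitness):
    """Eq. (17): accept the test position only if it lowers the fitness F."""
    return trial if fitness(trial) < fitness(p_old) else p_old


def sphere(v):
    # Hypothetical fitness used only for illustration (minimum at the origin).
    return sum(c * c for c in v)


rng = random.Random(0)
p_old = [1.0, -2.0, 0.5]   # current personal best P^(t)(i)
x_new = [0.2, -0.1, 3.0]   # candidate X^(t+1)(i), e.g. from Eq. (13) or (15)

trial = crossover(x_new, p_old, rho=0.5, rng=rng)
p_next = greedy_update(trial, p_old, sphere)
```

With \(\rho\) close to 0 the test position stays near the particle's own best, and with \(\rho\) close to 1 it follows the newly generated position, which mirrors the exploration and convergence trade-off described above.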

4.3 Hybrid QPSO-SVM

This section proposes the hybrid QPSO-SVM classification model for heart disease detection. The proposed model has three stages. First, the swarm is initialized: the position of the ith particle in the d-dimensional space is S(t) (i) = [S(t) (i, 1), S(t) (i, 2),..., S(t) (i, d)], the number of particles is set to M, the maximum number of iterations is set to \({Max}_{iter}\), and the ranges of particle position and particle velocity are set. The model uses k-fold cross-validation to ensure its effectiveness. Second, the root mean square error (RMSE) is used by the QPSO-SVM algorithm as the fitness function. RMSE is defined as follows: RMSE = \(\sqrt{\frac{\sum_{i=1}^{N}{\left({Y}_{i}-\widehat{{Y}_{i}}\right)}^{2}}{N}}\), where \({Y}_{i}\) and \(\widehat{{Y}_{i}}\) denote the predicted and actual values, respectively, and N is the number of test samples. Thereafter, the model checks the number of iterations; if \({Max}_{iter}\) has not been reached, the model updates the individual optimal position Pbest(t) (i) of the ith particle and then the velocity and the position of the ith particle using Eqs. (8) and (9), respectively. The population of the next generation S(t + 1) (i) is as follows: S(t + 1) (i) = [S(t + 1) (i, 1), S(t + 1) (i, 2),..., S(t + 1) (i, d)]. Once \({\mathrm{Max}}_{\mathrm{iter}}\) is reached and all k folds have been processed, the model computes the average RMSE and average accuracy over the k folds. Finally, it returns the best individual position Pbest and the global best position gbest of the whole swarm. For a minimization problem, a particle whose fitness value is below the swarm average indicates that its neighboring search region is promising. The fitness function is thus used to evaluate the quality of each candidate solution in the population, and the goal of the QPSO-SVM algorithm is to evolve the population towards better fitness values, that is, lower RMSE.
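As a small worked example of the fitness measure, the following Python sketch computes RMSE with squared differences inside the sum. The function name `rmse` and the sample label lists are illustrative assumptions, not values from the study.

```python
import math


def rmse(y_actual, y_predicted):
    """Root mean square error over N paired samples."""
    if not y_actual or len(y_actual) != len(y_predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((a - p) ** 2 for a, p in zip(y_actual, y_predicted))
    return math.sqrt(squared / len(y_actual))


# Hypothetical labels: 1 = heart disease, 0 = no heart disease.
actual = [1, 0, 1, 1, 0]
predicted = [1, 0, 0, 1, 1]
error = rmse(actual, predicted)  # two of the five labels differ
```

For 0/1 class labels each misclassification contributes exactly 1 to the sum, so the RMSE equals the square root of the misclassification rate, and minimizing one minimizes the other.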

In any classification process, selecting a decision threshold is one of the most difficult challenges. This paper introduces a self-adaptive threshold scheme for tuning the QPSO-SVM parameters. The SVM classification equation is formulated as an optimization problem and solved using QPSO. The method optimizes the threshold values by effectively exploring the solution space to obtain the global best solution.

The steps of the QPSO-SVM approach are described as follows:

  • Step 1: Set the number of particles to M and the maximum number of iterations to \({\mathrm{Max}}_{\mathrm{iter}}\); set the ranges of particle position and particle velocity;

  • Step 2: The ith particle position in the d-th dimensional is S(t) (i), S(t) (i) = [S(t) (i, 1), S(t) (i, 2),..., S(t) (i, d)]; the ith particle velocity is V(t) (i), V(t) (i) = [V(t) (i, 1), V(t) (i, 2),..., V(t) (i, d)];

  • Step 3: i is set to 1;

  • Step 4: The particle position S(t) (i) is passed to the QPSO-SVM algorithm, which trains the model on a large number of samples;

  • Step 5: The root mean square error (RMSE) is then used by the QPSO-SVM algorithm as the fitness. RMSE is defined as follows: RMSE = \(\sqrt{\frac{\sum_{\mathrm{i}=1}^{\mathrm{N}}{\left({\mathrm{Y}}_{\mathrm{i}}-\widehat{{\mathrm{Y}}_{\mathrm{i}}}\right)}^{2}}{\mathrm{N}}}\) (where \({\mathrm{Y}}_{\mathrm{i}}\) and \(\widehat{{\mathrm{Y}}_{\mathrm{i}}}\) denote the predicted and actual values, respectively; N is the number of test samples).

  • Step 6: Update the individual optimal position Pbest(t) (i) of the ith particle;

  • Step 7: If i ≥ M, go to step 8; otherwise, set i = i + 1 and go back to step 4;

  • Step 8: Update the global best position gbest(t) of the whole swarm;

  • Step 9: If \(\mathrm{t}\ge {\mathrm{Max}}_{\mathrm{iter}}\), go to step 12; otherwise, go to step 10;

  • Step 10: Update the velocity and position of the ith particle using Eqs. (14) and (15), respectively. The population of the next generation S(t+1) (i) is as follows: S(t+1) (i) = [S(t+1) (i, 1), S(t+1) (i, 2),..., S(t+1) (i, d)].

  • Step 11: \(\mathrm{t}=\mathrm{t}+1\), go back to step 2;

  • Step 12: Return the best optimal position pbest and global position gbest of the whole swarm.
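The control flow of Steps 1 to 12 can be sketched in plain Python as follows. This is a simplified illustration under stated assumptions, not the authors' code: the `fitness` callable is a hypothetical stand-in for "train the SVM with the particle's parameters and return the RMSE", positions are updated with Eqs. (11) and (13) only, the sign of the \(\pm\) term is chosen with equal probability, no velocity is tracked, and the defaults (`n_particles`, `max_iter`, `theta`) are arbitrary.

```python
import math
import random


def qpso_minimize(fitness, dim, lo, hi, n_particles=20, max_iter=50,
                  theta=0.75, seed=0):
    """Minimize `fitness` over [lo, hi]^dim following the step list above."""
    rng = random.Random(seed)
    # Steps 1-2: create M particle positions inside the allowed range.
    swarm = [[rng.uniform(lo, hi) for _ in range(dim)]
             for _ in range(n_particles)]
    pbest = [p[:] for p in swarm]
    pbest_fit = [math.inf] * n_particles
    gbest, gbest_fit = swarm[0][:], math.inf
    history = []  # global best fitness recorded once per iteration

    for t in range(max_iter + 1):
        # Steps 3-7: evaluate each particle and refresh its personal best.
        for i in range(n_particles):
            f = fitness(swarm[i])
            if f < pbest_fit[i]:
                pbest[i], pbest_fit[i] = swarm[i][:], f
        # Step 8: refresh the global best of the whole swarm.
        best = min(range(n_particles), key=pbest_fit.__getitem__)
        if pbest_fit[best] < gbest_fit:
            gbest, gbest_fit = pbest[best][:], pbest_fit[best]
        history.append(gbest_fit)
        # Step 9: stop once the iteration budget is used up.
        if t >= max_iter:
            break
        # Step 10: move every particle; mbest is the mean personal best, Eq. (11).
        mbest = [sum(p[d] for p in pbest) / n_particles for d in range(dim)]
        for i in range(n_particles):
            for d in range(dim):
                psi = rng.random()
                u = 1.0 - rng.random()  # lies in (0, 1], so ln(1/u) is finite
                attractor = psi * (pbest[i][d] - gbest[d]) + gbest[d]
                step = theta * abs(mbest[d] - swarm[i][d]) * math.log(1.0 / u)
                x = attractor + step if rng.random() < 0.5 else attractor - step
                swarm[i][d] = min(max(x, lo), hi)  # clamp to the position range
        # Step 11: the for-loop advances t.

    # Step 12: return the best position found and its fitness.
    return gbest, gbest_fit, history


def sphere(v):
    # Hypothetical fitness used only to exercise the loop.
    return sum(c * c for c in v)


best, best_fit, history = qpso_minimize(sphere, dim=2, lo=-5.0, hi=5.0,
                                        max_iter=30)
```

In the setting described in this section, each particle position would encode the SVM parameters being tuned, and `fitness` would return the cross-validated RMSE of the resulting classifier.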

The QPSO-SVM model is explained in detail in Algorithm 1.

Algorithm 1 Pseudo-code for QPSO-SVM classification model.

4.4 Measure for performance evaluation based on self-adapting threshold

In this study, a series of operating points are generated by applying classifier thresholds to obtain a multiclass ROC curve. Here, the adaptive threshold \(\mathrm{\yen }(i,t)\) of particle i at iteration t is defined as a minimization problem as follows:

$$\mathrm{\yen }(i,t) =\left\{\begin{array}{ll}1& \mathrm{if}\;\mathrm{fit}\left({\mathrm{gbest}(\mathrm{i})}^{\left(t\right)}\right)<\mathrm{fit}\left({\mathrm{gbest}(\mathrm{d})}^{\left(t\right)}\right)\\ 0& \mathrm{if}\;\mathrm{fit}\left({\mathrm{gbest}(\mathrm{i})}^{\left(t\right)}\right)=\mathrm{fit}\left({\mathrm{gbest}(\mathrm{d})}^{\left(t\right)}\right)\end{array}\right.$$
(18)

where \(\mathrm{fit}\left({\mathrm{gbest}(\mathrm{d})}^{\left(t\right)}\right)\) is the fitness of the best global position and fit() is the function to be optimized.

Sensitivity, specificity, accuracy, precision, G-Mean, and F-score are used in this work to assess the performance of the proposed QPSO-SVM model. Their computation requires the counts of true positives and true negatives at a given threshold ¥, which are obtained from the confusion matrix as follows:

  • Accuracy\(( \mathrm{\yen })\): It calculates the proportion of correctly identified cases in the test set to the total number of observations. True Positive \({\varvec{T}}{\varvec{P}}(\mathrm{\yen })\) is the number of heart disease observations that the model correctly classifies as heart disease at classifier threshold \(\mathrm{\yen }\). Similarly, True Negative \({\varvec{T}}{\varvec{N}}(\mathrm{\yen })\) is the number of normal heart observations that the model correctly classifies as the absence of heart disease at classifier threshold \(\mathrm{\yen }\). False Positive FP \((\mathrm{\yen })\) is the number of normal heart observations that the model wrongly classifies as heart disease, and False Negative FN \((\mathrm{\yen })\) is the number of heart disease observations that the model wrongly classifies as normal.

    $$\mathbf{A}\mathbf{c}\mathbf{c}\mathbf{u}\mathbf{r}\mathbf{a}\mathbf{c}\mathbf{y}=\frac{{\varvec{T}}{\varvec{P}}(\mathrm{\yen })+{\varvec{T}}{\varvec{N}}(\mathrm{\yen })}{{\varvec{T}}{\varvec{P}}(\mathrm{\yen })+{\varvec{T}}{\varvec{N}}(\mathrm{\yen })+{\varvec{F}}{\varvec{P}}(\mathrm{\yen })+{\varvec{F}}{\varvec{N}}(\mathrm{\yen })}\times 100$$
    (19)
  • Recall\((\mathrm{\yen })\): It calculates the number of heart disease cases correctly identified by the model divided by the total number of actual heart disease cases in the test set.

    $$\mathbf{S}\mathbf{e}\mathbf{n}\mathbf{s}\mathbf{i}\mathbf{t}\mathbf{i}\mathbf{v}\mathbf{i}\mathbf{t}\mathbf{y}(\mathbf{R}\mathbf{e}\mathbf{c}\mathbf{a}\mathbf{l}\mathbf{l})=\frac{{\varvec{T}}{\varvec{P}}(\mathrm{\yen })}{{\varvec{T}}{\varvec{P}}(\mathrm{\yen })+{\varvec{F}}{\varvec{N}}(\mathrm{\yen })}\times 100$$
    (20)
  • Specificity\((\mathrm{\yen })\): It measures the proportion of people without heart disease who are correctly identified as such.

    $$\mathbf{S}\mathbf{p}\mathbf{e}\mathbf{c}\mathbf{i}\mathbf{f}\mathbf{i}\mathbf{c}\mathbf{i}\mathbf{t}\mathbf{y}=\frac{{\varvec{T}}{\varvec{N}}(\mathrm{\yen })}{{\varvec{T}}{\varvec{N}}(\mathrm{\yen })+{\varvec{F}}{\varvec{P}}(\mathrm{\yen })}\times 100$$
    (21)
  • Precision\((\mathrm{\yen })\): It calculates the number of correctly detected heart disease observations divided by the total number of observations that the model labels as heart disease.

    $$\mathbf{P}\mathbf{r}\mathbf{e}\mathbf{c}\mathbf{i}\mathbf{s}\mathbf{i}\mathbf{o}\mathbf{n}=\frac{{\varvec{T}}{\varvec{P}}(\mathrm{\yen })}{{\varvec{T}}{\varvec{P}}(\mathrm{\yen })+{\varvec{F}}{\varvec{P}}(\mathrm{\yen })}\times 100$$
    (22)
  • F1 Score\((\mathrm{\yen })\): It computes the harmonic mean of Precision and Recall. It is mainly used when the class distribution is unbalanced, and it is more informative than accuracy because it accounts for both FP and FN.

    $$\mathbf{F}1\;\mathbf{S}\mathbf{c}\mathbf{o}\mathbf{r}\mathbf{e}\left(\mathbf{\yen }\right)=2\times \frac{\mathbf{P}\mathbf{r}\mathbf{e}\mathbf{c}\mathbf{i}\mathbf{s}\mathbf{i}\mathbf{o}\mathbf{n}\left(\mathbf{\yen }\right)\times \mathbf{R}\mathbf{e}\mathbf{c}\mathbf{a}\mathbf{l}\mathbf{l}\left(\mathbf{\yen }\right)}{\mathbf{P}\mathbf{r}\mathbf{e}\mathbf{c}\mathbf{i}\mathbf{s}\mathbf{i}\mathbf{o}\mathbf{n}\left(\mathbf{\yen }\right)+\mathbf{R}\mathbf{e}\mathbf{c}\mathbf{a}\mathbf{l}\mathbf{l}\left(\mathbf{\yen }\right)}$$
    (23)
  • G-Mean\((\mathbf{\yen })\): It measures the balance between sensitivity and specificity and is an important measure for class-imbalance problems.

    $${\varvec{G}}-{\varvec{M}}{\varvec{e}}{\varvec{a}}{\varvec{n}}\boldsymbol{ }\left(\mathbf{\yen }\right)=\sqrt{{\varvec{S}}{\varvec{e}}{\varvec{n}}{\varvec{s}}{\varvec{i}}{\varvec{t}}{\varvec{i}}{\varvec{v}}{\varvec{i}}{\varvec{t}}{\varvec{y}}(\mathbf{\yen })\times {\varvec{S}}{\varvec{p}}{\varvec{e}}{\varvec{c}}{\varvec{i}}{\varvec{f}}{\varvec{i}}{\varvec{c}}{\varvec{i}}{\varvec{t}}{\varvec{y}}(\mathbf{\yen })}$$
    (24)
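The measures in Eqs. (19) to (24) can be computed from the confusion-matrix counts at a given threshold, as in the following Python sketch. The helper names, the rule that a score at or above the threshold is labelled as heart disease, and the sample scores are assumptions made for illustration. Percent scaling follows Eqs. (19) to (22); G-Mean is also expressed as a percentage here, while the F1 score is left on the 0 to 1 scale.

```python
import math


def confusion_counts(y_true, scores, threshold):
    """Count TP, TN, FP, FN when score >= threshold is labelled 1 (disease)."""
    tp = tn = fp = fn = 0
    for y, s in zip(y_true, scores):
        predicted = 1 if s >= threshold else 0
        if predicted == 1 and y == 1:
            tp += 1
        elif predicted == 0 and y == 0:
            tn += 1
        elif predicted == 1 and y == 0:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn


def _ratio(num, den):
    # Return 0.0 for an empty denominator instead of raising.
    return num / den if den else 0.0


def evaluate(tp, tn, fp, fn):
    """Accuracy, sensitivity, specificity, precision, F1 and G-Mean."""
    sens = _ratio(tp, tp + fn)
    spec = _ratio(tn, tn + fp)
    prec = _ratio(tp, tp + fp)
    return {
        "accuracy": 100 * _ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": 100 * sens,
        "specificity": 100 * spec,
        "precision": 100 * prec,
        "f1": _ratio(2 * prec * sens, prec + sens),
        "g_mean": 100 * math.sqrt(sens * spec),
    }


# Hypothetical labels and classifier scores for eight patients.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]
counts = confusion_counts(y_true, scores, threshold=0.5)
results = evaluate(*counts)
```

In this toy case the counts are TP = 3, TN = 3, FP = 1, and FN = 1, so accuracy, sensitivity, specificity, precision, and G-Mean all equal 75% and the F1 score is 0.75.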

5 Experimental results and discussion

The proposed classification model for heart disease detection is implemented and tested using Python 3 on a PC with an Intel(R) Core(TM) i5-7200U and 16 GB of RAM. The proposed model was run on the heart disease dataset to investigate the feasibility of QPSO-SVM. Table 1 shows detailed information about the UCI dataset properties. Of the 297 patients, 137 have a class value of 1, indicating the presence of heart disease, while the remaining 160 have a class value of 0, indicating the absence of heart disease.

Table 1 Cleveland heart disease dataset attributes detailed information

5.1 Parameters setting

Several parameters in the proposed QPSO-SVM model must be initialized before evaluation. QPSO-SVM was trained using grid search techniques. Table 2 shows the initial parameters for the competitive classification models GA-SVM, ABC-SVM, GSVMA, CS-PSO-SVM, GWO-SVM, and GOA-SVM. These parameters include the number of particles, the cognition learning factors C1 and C2, the mutation probability, and other parameters.

Table 2 The Initial parameters of the classification models

5.2 Results of the Cleveland dataset

The effectiveness of all competitive classification algorithms, including PSO-SVM, GA-SVM, ABC-SVM, GSVMA, CS-PSO-SVM, GWO-SVM, and GOA-SVM, is evaluated in this section using the Cleveland heart disease dataset. The accuracy of the classification models is calculated before and after threshold-offset tuning. Tables 3 and 4 compare the classification accuracies of the proposed QPSO-SVM and the other classifiers from the literature on the heart disease dataset with the grid search technique and k-fold = 10. Table 3 summarises the results before threshold-offset tuning. It can be seen that QPSO-SVM achieved the highest classification accuracy under different grid parameters in comparison with the other algorithms. From Table 3, the average accuracies achieved by the ABC-SVM, CS-PSO-SVM, and GOA-SVM algorithms were 87.40%, 86.21%, and 83.98%, respectively. A slightly higher accuracy was achieved by the PSO-SVM, GA-SVM, and GSVMA approaches, with average accuracies of 88.65%, 89.28%, and 89.13%, respectively. The GWO-SVM algorithm provides a competitive accuracy of 90.87%. However, the proposed QPSO-SVM gives the best accuracy of 92.81%. Additionally, Table 4 displays the accuracy for the Cleveland heart disease dataset after threshold-offset tuning with k-fold = 10. It shows that the average accuracy of the QPSO-SVM classifier is superior to that of the other classifiers after threshold-offset adjustment: QPSO-SVM has a 96.31% accuracy rate. GA-SVM provided the second-best result with an accuracy of 93.01%. GOA-SVM, on the other hand, achieves worse outcomes, with an accuracy rate of 87.38%.

Here, to avoid becoming trapped in a local optimum, the QPSO algorithm acts in such a way that when particles encounter such a location, they are moved to different portions of the search space, where they search for optimised solutions; this process is repeated until the global optimised solutions are found. The technique works effectively on problems of very high dimension as well as on problems where the population is poorly distributed due to the movement of the employed particles. Thus, QPSO-SVM has a high success rate in addressing optimization problems.

Table 3 Accuracy for Cleveland heart disease dataset using classification models with grid search technique (before Threshold-offset tuning and using k-fold = 10)
Table 4 Accuracy for Cleveland heart disease dataset using classification models with grid search technique (after Threshold-offset tuning and using k-fold = 10)

Figure 3 illustrates the confusion matrices of correctly and incorrectly predicted heart disease cases for the proposed QPSO-SVM model and all competitive classification algorithms, including PSO-SVM, GA-SVM, ABC-SVM, GSVMA, CS-PSO-SVM, GWO-SVM, and GOA-SVM. As can be seen in the figure, the proposed QPSO-SVM model produces very few incorrect classifications, so the classification can be done accurately and effectively.

Fig. 3 Confusion matrix for Cleveland heart disease dataset using competitive classification algorithms, including QPSO-SVM, PSO-SVM, GA-SVM, ABC-SVM, GSVMA, CS-PSO-SVM, GWO-SVM, and GOA-SVM

5.3 Comparison between different classifiers

The standard statistical p-value is used in this study to determine whether the proposed QPSO-SVM algorithm’s mean coverage values are significantly lower than those of competing models for the ten k-fold scenarios. Table 5 shows that the p-values for C(QPSO-SVM, PSO-SVM), C(QPSO-SVM, GA-SVM), C(QPSO-SVM, GSVMA), C(QPSO-SVM, CS-PSO-SVM), C(QPSO-SVM, GWO-SVM), and C(QPSO-SVM, GOA-SVM) are (p < 0.00001), (p < 0.02786), (p < 0.03192), (p < 0.21138), (p < 0.000012), (p < 0.29875), and (p < 0.5507), respectively, which indicates a significant difference between the performance of QPSO-SVM and that of PSO-SVM, GA-SVM, ABC-SVM, GSVMA, CS-PSO-SVM, GWO-SVM, and GOA-SVM for the ten k-fold cases on the heart disease dataset. In addition, it is clear that the mean coverage ratio (0.400944), standard deviation (SD) (0.022743), and covariance (CV) (4.45) of C(QPSO-SVM, PSO-SVM) are superior to those of C(QPSO-SVM, ABC-SVM) (0.302981, 0.021531, 4.41) and C(QPSO-SVM, GSVMA) (0.270304, 0.02031, 5.45), respectively. One of the reasons why the proposed QPSO-SVM classifier outperforms previous classifiers is that it is based on our proposed adaptive threshold method.

Table 5 Comparison of QPSO-SVM’s SD, CV, P-value, and average mean with those of algorithms described in a previous study on the Cleveland dataset

Table 6 shows the superiority of the novel QPSO-SVM algorithm. The sensitivity (96.13%), specificity (93.56%), precision (94.23%), and F1 score (0.95) achieved by QPSO-SVM after threshold-offset tuning are better than those obtained by QPSO-SVM before threshold-offset tuning. Overall, the QPSO-SVM method outperformed all other competing algorithms in terms of sensitivity, specificity, precision, and F1 score.

Table 6 Accuracy, sensitivity (recall), specificity, precision, and F1 score for Cleveland heart disease dataset using k-fold = 10 (Before and After Threshold-offset tuning)

5.4 Comparison of the ROC curves of classifiers

Figure 4 shows the ROC curves and geometric mean (G-Mean) of all classification models (i.e., QPSO-SVM, PSO-SVM, GA-SVM, ABC-SVM, GSVMA, GWO-SVM, CS-PSO-SVM, and GOA-SVM) for the Cleveland heart disease dataset before threshold-offset tuning. The proposed QPSO-SVM model performed best (ROC = 0.91) before threshold-offset tuning on the Cleveland heart disease dataset, followed by the GA-SVM model. On the other hand, GWO-SVM had the lowest ROC score (ROC = 0.79), followed by the GOA-SVM and CS-PSO-SVM models, while ABC-SVM and GSVMA performed at the average level. QPSO-SVM also had the highest G-Mean among all classification methods (G-Mean = 90.25), followed by GSVMA (G-Mean = 89.60) and CS-PSO-SVM (G-Mean = 83.97). The reason for this result is that the QPSO-SVM model overcomes the limited particle-to-particle communication of PSO, a limitation that makes PSO prone to falling into a local optimum in high-dimensional space and slows its convergence during the iterative process.

Fig. 4 ROC curve and G-mean before Threshold-offset tuning of A QPSO-SVM, B PSO-SVM, C GA-SVM, D ABC-SVM, E GSVMA, F CS-PSO-SVM, G GWO-SVM, H GOA-SVM

Figure 5 shows the ROC curves of QPSO-SVM, PSO-SVM, GA-SVM, ABC-SVM, GSVMA, CS-PSO-SVM, GWO-SVM, and GOA-SVM after threshold-offset tuning. Figure 5 also shows the effect of the preference factor of optimal threshold-offset tuning on the AUC and G-Mean. In the case of QPSO-SVM, the optimal threshold is around 0.37 with G-Mean = 95.14 at the point of threshold-offset tuning. In contrast, PSO-SVM achieved ROC (area = 0.84) and a G-Mean of 90.03 at the optimal threshold of 0.21. Similarly, ABC-SVM produced a G-Mean of 87.59 and an ROC area of 0.86 at a threshold of 0.35. Overall, QPSO-SVM outperformed all other competing models in terms of G-Mean and AUC, while ABC-SVM has the worst performance in terms of G-Mean and AUC. The proposed approach takes advantage of QPSO, which makes the search for the best threshold values more powerful and more stable. The overall runtimes are not affected by the number of threshold values.

Fig. 5 ROC curve and G-mean after Threshold-offset tuning of A QPSO-SVM, B PSO-SVM, C GA-SVM, D ABC-SVM, E GSVMA, F CS-PSO-SVM, G GWO-SVM, H GOA-SVM
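The effect of moving the decision threshold away from a default value can be illustrated with a short Python sketch. In the proposed model the threshold is tuned by QPSO; the sketch below substitutes a plain grid sweep that keeps the candidate with the highest G-Mean, purely to show the idea. The function names, the candidate grid, and the sample scores are hypothetical.

```python
import math


def g_mean_at(y_true, scores, threshold):
    """G-Mean (0 to 1 scale) when score >= threshold is labelled 1 (disease)."""
    pairs = list(zip(y_true, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    tn = sum(1 for y, s in pairs if y == 0 and s < threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return math.sqrt(sensitivity * specificity)


def best_threshold(y_true, scores, candidates):
    """Return the candidate threshold with the highest G-Mean (first on ties)."""
    return max(candidates, key=lambda th: g_mean_at(y_true, scores, th))


# Hypothetical scores: one diseased patient scores below the default 0.5.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.6, 0.43, 0.38, 0.2, 0.1]
candidates = [k / 20 for k in range(1, 20)]  # 0.05, 0.10, ..., 0.95

default_g = g_mean_at(y_true, scores, 0.5)
tuned = best_threshold(y_true, scores, candidates)
tuned_g = g_mean_at(y_true, scores, tuned)
```

In this toy case the default threshold of 0.5 misses one diseased patient, whereas the tuned threshold of 0.40 separates the two classes completely, so the G-Mean rises from about 0.82 to 1.0.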

6 Conclusion and future work

In recent decades, great progress has been made in cardiovascular disease research. Although many studies have addressed heart risk detection, most of them are ineffective and have many limitations. In this study, a hybrid model, namely QPSO-SVM, is proposed to analyze and predict heart disease risk. The QPSO-SVM model combines the benefits of the QPSO and SVM algorithms with an adaptive threshold method. First, data preprocessing was performed by converting nominal data into numerical data and applying effective scaling techniques. Second, the SVM parameters are automatically adjusted by QPSO. Finally, the proposed QPSO-SVM is used to classify the input data. The proposed model was evaluated by tenfold cross-validation over the Cleveland dataset, which was divided into two groups: 80% for training and 20% for testing. Furthermore, existing state-of-the-art methods, such as PSO-SVM, GA-SVM, ABC-SVM, GSVMA, CS-PSO-SVM, GWO-SVM, and GOA-SVM, have been used to predict cardiovascular diseases based on the Cleveland dataset. Experimental results show that the proposed QPSO-SVM algorithm achieves an accuracy of 96.31%, which is higher than that of the existing algorithms. In addition, the proposed model outperforms the other heart disease prediction models considered in this research in terms of classification accuracy, specificity, precision, G-Mean, and F1 score. Furthermore, QPSO-SVM achieved good accuracy and AUC rates compared with related work.