Quantum mayfly optimization based feature subset selection with hybrid CNN for biomedical Parkinson’s disease diagnosis

Mansour, Romany F.

doi:10.1007/s00521-024-09516-1

Original Article
Open access
Published: 22 February 2024

Volume 36, pages 8383–8396, (2024)

Neural Computing and Applications

Romany F. Mansour ORCID: orcid.org/0000-0001-5857-8495¹

Abstract

Parkinson's disease (PD) arises from brain cell damage and necessitates early detection for effective treatment and symptom management. While various methods such as voice, speech, and handwriting exams have been explored, automated tools are crucial to enhance accuracy. Recent advancements in artificial intelligence (AI) and deep learning (DL) provide an opportunity for precise early-stage PD identification. This study introduces a novel approach known as Quantum Mayfly Optimization-based feature subset selection with hybrid convolutional neural network (QMFOFS-HCNN) to improve PD detection and classification. QMFOFS-HCNN is designed to identify optimal feature subsets and overcome the dimensionality challenge. It combines a quantum mayfly optimization approach for feature selection with a convolutional neural network with attention-based long short-term memory for PD detection and classification. Additionally, hyperparameter selection is optimized using the Nadam optimizer. Experimental validation on benchmark datasets showed that the QMFOFS-HCNN technique achieved accuracy rates of 96.35% on the HandPD Spiral, 96.7% on the HandPD Meander, 98.5% on the Speech PD, and 100% on the Voice PD datasets. These quantitative findings underscore the potential of AI and DL to enhance early PD detection accuracy significantly and offer promising prospects for improving healthcare outcomes in managing PD and related neurological disorders.

1 Introduction

Parkinson's disease (PD), first described by James Parkinson, affects an individual's movement, leading to muscle stiffness, tremors, and changes in speech and writing skills [1]. The condition occurs when the nerve cells that produce the chemical dopamine break down, leaving nerve cells unable to transmit messages accurately; genetic factors may contribute to it. It can result in depression, nervous disorders, and memory impairment [2]. Several studies have attempted to diagnose the disease at an early stage, though without great success. Identifying the disease early is important so that patients can maintain their quality of life [3]. At an advanced stage, the disease affects day-to-day tasks, and the person might need help from others. The later stages of PD are severe: the patient develops stiffness in the legs, which can make it impossible to stand or walk and might cause freezing on standing. Several methods have been used to identify the disease correctly, such as writing, speech, and voice exams [4]. The handwriting exam is widely employed for diagnosing PD because the data are easy and inexpensive to obtain.

In recent times, datasets have grown in the number of features and instances, which makes the data noisier [5]. Noisier datasets increase the computational cost and complexity of an algorithm, slow its training, and decrease its predictive accuracy. Thus, feature selection (FS) has become an important step for machine learning (ML) approaches before the model is trained [6]. Processing and preprocessing the data is complex, because an increase in the feature and instance counts increases the quantity of data. Growth in data makes it more vulnerable to noise, which might degrade results and performance. Therefore, treating the data becomes indispensable [7]. Complexity and computational cost tend to increase when a large amount of data is used. Hence, the FS method plays an important role in building ML architectures. In the FS method, also called parameter selection, a subset of features is selected from the existing features.

The primary objective of FS is to improve the algorithm's accuracy relative to training on the full feature set. The FS method also reduces the computational complexity and cost of working with the datasets [8]. As noted above, large numbers of instances and features make data noisier, and noisier datasets reduce model accuracy, increase computational cost and complexity, and slow training [9]. Consequently, FS has become an essential step for ML before the model is trained. The FS approach seeks a subset of the entire feature set that degrades network performance as little as possible, so that the subset predicts the target with accuracy similar to that of the original feature set at a reduced computational cost. Feature selection helps to understand the causes of disease, reduces the computational requirements, and prevents degradation in performance, which contributes to better and faster convergence of deep training methods.

FS models are classified into wrapper-based and filter-based models [10]. Wrapper-based approaches use ML algorithms to check the accuracy of each candidate feature subset and so tend to select subsets with high accuracy. However, these approaches can be ineffective on high-dimensional datasets because of their long training times [11]. Filter-based approaches, in contrast, use statistical data-dependency measures to reach a good subset faster. Filter-based approaches are less accurate but more scalable, faster, and less computationally expensive than wrapper-based approaches [12].
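The contrast between the two families can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method used in this study: the filter score is taken to be the absolute Pearson correlation between a feature and the label, the wrapper score is the leave-one-out accuracy of a 1-nearest-neighbour classifier, and all function names and the toy data are hypothetical.

```python
import math
from itertools import combinations

def pearson_abs(xs, ys):
    """Filter criterion (assumed): |Pearson correlation| of one feature with the label."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return abs(cov / (sx * sy)) if sx > 0 and sy > 0 else 0.0

def filter_rank(X, y):
    """Rank feature indices by the filter score; no classifier is trained."""
    n_feat = len(X[0])
    scores = [pearson_abs([row[j] for row in X], y) for j in range(n_feat)]
    return sorted(range(n_feat), key=lambda j: scores[j], reverse=True)

def loo_1nn_accuracy(X, y, subset):
    """Wrapper criterion (assumed): leave-one-out accuracy of 1-NN on the subset."""
    correct = 0
    for i, xi in enumerate(X):
        best_d, best_label = float("inf"), None
        for k, xk in enumerate(X):
            if k == i:
                continue
            d = sum((xi[j] - xk[j]) ** 2 for j in subset)
            if d < best_d:
                best_d, best_label = d, y[k]
        correct += best_label == y[i]
    return correct / len(X)

def wrapper_search(X, y, size):
    """Evaluate every subset of the given size with the classifier (exhaustive)."""
    n_feat = len(X[0])
    return max(combinations(range(n_feat), size),
               key=lambda s: loo_1nn_accuracy(X, y, s))

# Toy data: feature 0 tracks the label, feature 1 is noise, feature 2 is constant.
X = [[0.1, 5.0, 1.0], [0.2, 1.0, 1.0], [0.9, 4.0, 1.0], [1.0, 2.0, 1.0]]
y = [0, 0, 1, 1]
best_by_filter = filter_rank(X, y)[0]
best_by_wrapper = wrapper_search(X, y, size=1)
```

The wrapper evaluates the classifier once per candidate subset, which is why its cost grows quickly with the number of features, whereas the filter computes one statistic per feature.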

The QMFOFS-HCNN technique aims to improve Parkinson's disease (PD) detection and classification by utilizing Quantum Mayfly Optimization (QMFO) for feature selection, a Convolutional Neural Network with Attention Long Short Term Memory (CNN-ALSTM) for classification, and hyperparameter tuning with the Nadam optimizer. The contributions of this study are: (i) It uses QMFO to select relevant features, enhancing classification accuracy and reducing computational complexity. (ii) It employs CNN-ALSTM for PD classification, which is well suited to biomedical time series data and uses an attention mechanism to capture important information. (iii) It fine-tunes model parameters with the Nadam optimizer, improving overall performance. (iv) It demonstrates superior accuracy and detection rates compared to existing methods on benchmark PD datasets. (v) It efficiently selects minimal features while maintaining high accuracy, which is crucial for real-world applications. (vi) It is effective across various PD datasets, suggesting broader applicability.

Thus, this study develops a quantum mayfly optimization-based feature subset selection with a hybrid convolutional neural network (QMFOFS-HCNN) technique for PD detection and classification. The principal intention of the QMFOFS-HCNN technique is to identify the optimal feature subsets and enhance the classification accuracy of the PD diagnosis. The QMFOFS-HCNN technique initially designs a novel QMFO approach for the optimum feature choice and resolves the curse of dimensionality problem. In addition, an optimal CNN with attention long short-term memory (CNN-ALSTM) model is employed to detect and classify PD. In order to effectively boost the PD classification outcomes, the Nadam optimizer can be utilized to select the hyperparameters. The experimental validation takes place using the benchmark datasets, and the results are assessed under several aspects.

2 Literature survey

The authors in [13] developed a cloud-based PD predictive model for making medical decisions that assists physicians in identifying a Parkinson-affected person from a remote place. An efficient expanded cat swarm optimization (ECSO)-based FS method was examined to resolve the problem of data dimensionality. Applying the FS method with a K-nearest neighbour (K-NN) classifier considerably enhances disease prediction performance. Solana-Lavalle et al. [14] focused on increasing the accuracy and reducing the number of selected vocal features in PD diagnosis while utilizing the most extensive and newest open-source dataset. While this public dataset has 754 features, the number of features selected for classification ranges from 8 to 20 after Wrapper feature subset selection. The K-NN, multilayer perceptron (MLP), support vector machine (SVM), and random forest (RF) classifiers are employed for detecting vocal-based PD.

Mathur et al. [15] applied different ML methods, which can enhance the efficiency of datasets and play a significant part in early disease prediction, and compared the algorithms to select the most accurate one. The experimental outcomes show that combining an artificial neural network (ANN) with the K-NN algorithm is more effective than the other approaches. The authors in [11] introduced two NN-based methods, a voice impairment classifier and a spectrogram detector, that aim to help people and doctors identify the disease at earlier stages. They evaluated a CNN on a large image classification task, in which gait signals were transformed into spectrogram images, and a deep dense ANN on voice recordings to predict the disease. El Maachi et al. [12] developed a deep learning (DL) method for analysing gait data, in which a 1D-Convnet is used to construct the deep neural network (DNN) classifier. The presented method processes eighteen 1D signals from foot sensors. Haq et al. [16] introduced an ML- and DNN-based non-invasive predictive model for timely and accurate diagnosis of PD. The ML prediction methods, namely SVM, linear regression (LR), and DNN, were used to distinguish healthy people from people with PD. Zhang et al. [17] proposed an energy direction feature based on empirical mode decomposition (EDF-EMD) to capture the differences between the voice signals of healthy people and PD patients. First, the intrinsic mode functions (IMFs) are obtained by decomposing the voice signal with empirical mode decomposition.

In Parkinson's disease (PD) research, several previous studies have aimed to diagnose the condition in its early stages but have had limited success [18, 19]. Detecting PD early on is crucial for improving the quality of life for patients. Existing approaches have explored methods such as handwriting analysis, speech assessment, and voice examinations, with handwriting being a preferred choice due to its ease of data collection and affordability [20,21,22]. However, contemporary data sets have grown in size and complexity, introducing noise that can hinder algorithm accuracy, increase computational costs, and slow down data processing [23]. Researchers have turned to feature selection (FS) techniques to address these challenges as a critical step in machine learning (ML) model development. FS helps optimize algorithm performance by selecting a subset of relevant features, reducing computational complexity, and mitigating data noise. The QMFOFS-HCNN technique presented in this study represents a significant advancement in PD detection and classification. It leverages Quantum Mayfly Optimization (QMFO) for feature selection, employs a Convolutional Neural Network with Attention Long Short Term Memory (CNN-ALSTM) for classification, and fine-tunes model parameters using the Nadam optimizer. The key contributions of this research include improved accuracy, feature subset optimization, and enhanced classification performance. Importantly, this technique efficiently selects minimal features while maintaining high accuracy, making it suitable for real-world applications across various PD datasets. Compared to prior work, this study introduces a comprehensive and innovative approach to PD detection, offering the potential for more accurate and efficient diagnoses. 
While previous research has explored various machine learning methods and feature selection techniques [24], the QMFOFS-HCNN method stands out for its superior accuracy, computational efficiency, and adaptability across diverse PD datasets.

3 Material and methods

3.1 Dataset

The proposed method has been employed with datasets related to Parkinson's disease, encompassing diverse types of sound recordings, as well as data from Parkinson's HandPD, which are as follows:

In the Speech PD dataset, a set of biomedical voice measurements was gathered from 23 individuals. Each column corresponds to a particular voice measurement, and each row corresponds to one of the 195 voice recordings taken from these individuals. The main aim of this dataset is to discriminate between healthy individuals (coded as 0) and those with PD (coded as 1). The dataset was curated by Max Little of the University of Oxford, in partnership with the National Centre for Voice and Speech in Denver, Colorado, where the speech signals were acquired.

In the Voice PD dataset, the training data comprise records from 20 individuals with PD (14 male and 6 female) and 20 healthy individuals (10 male and 10 female), who were seen at the Department of Neurology in the Cerrahpasa Faculty of Medicine, Istanbul University. In the data acquisition process, 28 PD patients were asked to repeat the vowels 'a' and 'o' three times each, yielding a total of 168 voice recordings. These recordings serve as an independent test set for validating the results obtained from the training dataset.

In the HandPD meander dataset, data were collected from a total of 158 individuals, including 74 in the patient group and 18 in the healthy group. The dataset comprises 632 data instances with 13 distinct features, along with 632 images of meanders drawn by the participants. The participants ranged in age from 14 to 79 years. The handwriting examinations were conducted at Botucatu Medical School, São Paulo State University, Brazil.

In the HandPD spiral dataset, participants were asked to draw spirals instead of meanders. This dataset comprises data from 158 individuals, with 632 data instances and 13 distinct features. The handwriting examinations were collected at Botucatu Medical School, São Paulo State University, Brazil.

3.2 Methods

3.2.1 Design of QMFOFS-HCNN model

This study develops a novel QMFOFS-HCNN technique for detecting and classifying PD. It aims to identify the optimal feature subsets and optimize the classification performance of the PD diagnosis. The proposed QMFOFS-HCNN technique encompasses several processes: QMFO-based feature subset selection, CNN-ALSTM-based classification, and Nadam-based hyperparameter tuning. Using the QMFOFS and Nadam techniques helps boost the PD classification outcomes effectively. Figure 1 depicts the entire working process of the proposed QMFOFS-HCNN technique. First, the Parkinson dataset is given as input and pre-processed to remove artefacts. Subsequently, the dataset is divided into training and testing sets. Afterward, a novel QMFO-based feature selection (FS) method is used to resolve the curse of dimensionality problem by reducing computational complexity. Large numbers of instances and features make data noisier, and a noisier dataset reduces model accuracy, increases computational cost and complexity, and slows the training process.

Consequently, the FS approach focuses on finding an appropriate subset of the entire feature set that gives high network performance at lower computational cost. In order to optimize the efficiency of the MFO algorithm, the QMFO technique is derived; for details, refer to [25]. The CNN-ALSTM model is then employed for PD classification, and the Nadam optimizer is used for hyperparameter tuning to boost the classification outcome. The CNN-ALSTM is a hybrid DL approach that extracts features from the raw data and performs prediction using the LSTM-NN [26]. The CNN extracts features from the experimental data for the LSTM, and the attention mechanism is a procedure for allocating weights. Thus, the proposed work develops a QMFOFS-HCNN technique for PD detection and classification. The primary intention of the proposed technique is to identify the optimal feature subsets and enhance the classification accuracy of the PD diagnosis.

3.2.2 Algorithmic design of QMFOFS technique

The MFO algorithm derives from the social behavior of mayflies (MFs) [27]. MFs are assumed to be adults as soon as they hatch, and afterward the fittest survive. Two populations are initially created, representing the male and female populations. Each candidate is represented by a $d$-dimensional vector $x=\left({x}_{1},\dots ,{x}_{d}\right)$. The fitness of a candidate is estimated by computing the fitness function (FF) $f(x)$. A velocity $v=({v}_{1},\dots ,{v}_{d})$ describes the change in the candidate's position. Every candidate adjusts its trajectory based on its own best position (pbest) and the best position found by any MF (gbest).

The gathering of male MFs in swarms means that each male adjusts its position according to its own experience and that of its neighbors. Denoting by ${x}_{i}^{t}$ the current position of candidate solution $i$ at time $t$, the position is changed by adding a velocity ${v}_{i}^{t+1}$ as [28]:

$${x}_{i}^{t+1}={x}_{i}^{t}+{v}_{i}^{t+1}$$

(1)

with ${x}_{i}^{0}\sim U({x}_{\mathrm{min}},{x}_{\mathrm{max}})$. Assuming that the male population moves at low velocities, the velocity is computed as follows:

$${v}_{ij}^{t+1}={v}_{ij}^{t}+{a}_{1}{e}^{-\beta {r}_{p}^{2}} \left(p{\text{best}}_{ij}-{x}_{ij}^{t}\right)+{a}_{2}{e}^{-\beta {r}_{g}^{2}} \left(g{\text{best}}_{j}-{x}_{ij}^{t}\right),$$

(2)

where ${v}_{ij}^{t}$ is the velocity of MF $i$ in dimension $j$ at time $t$, ${x}_{ij}^{t}$ is the position of MF $i$ in dimension $j$ at time $t$, and ${a}_{1}$ and ${a}_{2}$ are positive attraction constants. $p{\text{best}}_{i}$ is the best position that candidate solution $i$ has ever visited; its value at the subsequent step $t+1$ is defined in Eq. (3).

$$p{\text{best}}_{i} = \left\{ {\begin{array}{*{20}l} {x_{i}^{t + 1} ,} \hfill & {if f\left( {x_{i}^{t + 1} } \right) < f\left( {p{\text{best}}_{i} } \right)} \hfill \\ {{\text{same as before}},} \hfill & {{\text{otherwise}}} \hfill \\ \end{array} } \right.$$

(3)

where $f:{\mathbb{R}}^{n}\to {\mathbb{R}}$ is the objective function to be minimized and $g\text{best}$ is the global best position found up to time $t$. The coefficient $\beta$ in Eq. (2) limits an MF's visibility to the rest of the population. ${r}_{p}$ is the distance between ${x}_{i}$ and $p{\text{best}}_{i}$, and ${r}_{g}$ is the distance between ${x}_{i}$ and $g\text{best}$. Both distances are computed as in Eq. (4).

$$\Vert {x}_{i}-{X}_{i}\Vert =\sqrt{{\sum }_{j=1}^{n}({x}_{ij}-{X}_{ij}{)}^{2}}$$

(4)

where ${x}_{ij}$ is the ${j}^{\text{th}}$ component of the ${i}^{\text{th}}$ candidate and ${X}_{i}$ corresponds to $p{\text{best}}_{i}$ or $g\text{best}$.

The best candidate keeps performing its up-and-down motion with varying velocity. Its velocity is defined as in Eq. (5).

$$v_{ij}^{t + 1} = v_{ij}^{t} + d \times r$$

(5)

where $d$ is the coefficient of the up-and-down motion and $r$ is a random value between $-1$ and 1. Figure 2 shows the flowchart of the MFO technique.
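The male update rules of Eqs. (1)-(5) can be sketched as follows. This is a minimal pure-Python illustration, not the implementation used in this study: the values of `a1`, `a2`, `beta`, and `d`, the `sphere` test function, and all function names are assumptions chosen for the example, and $r$ in Eq. (5) is drawn independently for each dimension.

```python
import math
import random

def distance(a, b):
    """Euclidean distance of Eq. (4)."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def male_velocity(v, x, pbest, gbest, a1=1.0, a2=1.5, beta=2.0):
    """Velocity update of Eq. (2) for one male mayfly (coefficients are assumed)."""
    rp = distance(x, pbest)   # distance to the personal best
    rg = distance(x, gbest)   # distance to the global best
    return [
        vj
        + a1 * math.exp(-beta * rp ** 2) * (pj - xj)
        + a2 * math.exp(-beta * rg ** 2) * (gj - xj)
        for vj, xj, pj, gj in zip(v, x, pbest, gbest)
    ]

def dance_velocity(v, d=0.1, rng=random):
    """Up-and-down motion of the best mayfly, Eq. (5); r is drawn from [-1, 1]."""
    return [vj + d * rng.uniform(-1.0, 1.0) for vj in v]

def move(x, v):
    """Position update of Eq. (1)."""
    return [xj + vj for xj, vj in zip(x, v)]

def update_pbest(pbest, x_new, f):
    """Personal-best rule of Eq. (3) for a minimization problem."""
    return list(x_new) if f(x_new) < f(pbest) else pbest

def sphere(x):
    """Hypothetical objective used only for this example."""
    return sum(xj * xj for xj in x)

# One update step for a single male mayfly.
x, v = [0.5, -0.5], [0.0, 0.0]
pbest, gbest = [0.4, -0.4], [0.0, 0.0]
v = male_velocity(v, x, pbest, gbest)
x_new = move(x, v)
pbest = update_pbest(pbest, x_new, sphere)
```

The exponential factors shrink the pull of pbest and gbest as the mayfly moves farther from them, which is how $\beta$ limits visibility.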

The female MFs do not gather in swarms; instead, they move toward the males. Let ${y}_{i}^{t}$ be the current position of female MF $i$ at time $t$. Its position is updated as:

$$y_{i}^{t + 1} = y_{i}^{t} + v_{i}^{t + 1}$$

(6)

with ${y}_{i}^{0}\sim U({x}_{\mathrm{min}},{x}_{\mathrm{max}})$. The velocity of the female MFs is defined as in Eq. (7).

$$v_{ij}^{t + 1} = \left\{ {\begin{array}{*{20}l} {v_{ij}^{t} + a_{2} e^{{ - \beta r_{mf}^{2} }} \left( {x_{ij}^{t} - y_{ij}^{t} } \right),} \hfill & if \quad {f\left( {y_{i} } \right) > f\left( {x_{i} } \right)} \hfill \\ {v_{ij}^{t} + fl \times r,} \hfill & {if \quad f\left( {y_{i} } \right) \le f\left( {x_{i} } \right)} \hfill \\ \end{array} } \right.$$

(7)

where ${v}_{ij}^{t}$ is the velocity of the ${i}^{\text{th}}$ female in dimension $j$ at time $t$, ${y}_{ij}^{t}$ is the position of the ${i}^{\text{th}}$ female candidate solution at time $t$, ${x}_{ij}^{t}$ is the position of the paired male, ${a}_{2}$ is a positive attraction constant, $\beta$ is the fixed visibility coefficient, ${r}_{\text{mf}}$ is the distance between the male and female candidate solutions, calculated using Eq. (4), $fl$ is the random-walk coefficient used when the female is not attracted to the male, and $r$ is a random number between $-1$ and 1. Mating is modeled by a crossover operator applied to a selected pair of male and female parents, as given in Eq. (8).
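The female velocity rule of Eq. (7) can be sketched as follows. This is an illustrative pure-Python version under stated assumptions: the default values of `a2`, `beta`, and `fl` are placeholders, the function name is hypothetical, and the random number $r$ is drawn independently for each dimension.

```python
import math
import random

def female_velocity(v, y, x_male, f, a2=1.5, beta=2.0, fl=1.0, rng=random):
    """Velocity update of Eq. (7) for one female paired with one male.

    f is the objective function being minimized; a2, beta and fl are
    assumed values, not settings reported in the paper.
    """
    if f(y) > f(x_male):
        # The male has the lower objective value, so the female is attracted to him.
        r_mf = math.sqrt(sum((xj - yj) ** 2 for xj, yj in zip(x_male, y)))
        pull = a2 * math.exp(-beta * r_mf ** 2)
        return [vj + pull * (xj - yj) for vj, xj, yj in zip(v, x_male, y)]
    # Otherwise the female performs a random walk scaled by fl.
    return [vj + fl * rng.uniform(-1.0, 1.0) for vj in v]
```

For example, with a sum-of-squares objective, a female at `[1.0, 1.0]` paired with a male at `[0.5, 0.5]` takes the attraction branch, and her velocity points toward the male.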

$$\begin{gathered} {\text{offspring}}_{1} = L \times {\text{male}} + \left( {1 - L} \right) \times {\text{female}} \\ {\text{offspring}}_{2} = L \times {\text{female}} + \left( {1 - L} \right) \times {\text{male}} \\ \end{gathered}$$

(8)

where $L$ is a random number. Initially, the velocity of the offspring is set to 0. In order to optimize the efficiency of the MFO algorithm, the QMFO technique is derived [25]. With the quantization of mayfly individuals, the search of the feature space is improved so as to balance exploitation and exploration. The basic unit of quantum computing (QC) is the qubit. A qubit is built from the two basis states $|0\rangle$ and $|1\rangle$ and is formulated as a linear combination of these two states:

$$|Q\rangle=\alpha |0\rangle+\beta |1\rangle.$$

(9)

${|\alpha |}^{2}$ is the probability of observing state $|0\rangle$ and ${|\beta |}^{2}$ is the probability of observing state $|1\rangle$, where ${|\alpha |}^{2}+{|\beta |}^{2}=1.$ A quantum individual is composed of $n$ qubits. Because of quantum superposition, each quantum individual can represent ${2}^{n}$ possible values.

$$\Psi = \sum\limits_{x = 0}^{2^{n} - 1} C_{x} |x\rangle ,\quad \sum\limits_{x = 0}^{2^{n} - 1} |C_{x} |^{2} = 1.$$

(10)

Quantum gates, such as the Hadamard, rotation, and NOT gates, modify the state of qubits. The rotation gate is used here as a mutation operator that moves the quantum individuals toward better solutions and ultimately toward the global optimum.

The rotation gate is shown as follows:

$$\left[ {\begin{array}{*{20}c} {\alpha^{d} \left( {t + 1} \right)} \\ {\beta^{d} \left( {t + 1} \right)} \\ \end{array} } \right] = \left[ {\begin{array}{*{20}c} {{\text{cos}}\left( {\Delta \theta^{d} } \right)} & { - {\text{sin}}\left( {\Delta \theta^{d} } \right)} \\ {{\text{sin}}\left( {\Delta \theta^{d} } \right)} & {{\text{cos}}\left( {\Delta \theta^{d} } \right)} \\ \end{array} } \right]\left[ {\begin{array}{*{20}c} {\alpha^{d} \left( t \right)} \\ {\beta^{d} \left( t \right)} \\ \end{array} } \right]{\text{for }}d = 1,2, \ldots ,n.$$

(11)

Here $\Delta \theta^{d} = \Delta \times S \left( {\alpha^{d} , \beta^{d} } \right)$ is the rotation angle of qubit $d$, where $\Delta$ and $S\left( {\alpha^{d} , \beta^{d} } \right)$ are the magnitude and direction of the rotation, respectively.
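To make the qubit encoding and the rotation gate of Eq. (11) concrete, the following pure-Python sketch represents each qubit as a real pair `(alpha, beta)`, rotates it, and measures it into a binary feature mask. It is a hedged illustration only: the equal-superposition initialization, the step size `delta`, and the simple sign rule standing in for $S(\alpha^{d}, \beta^{d})$ are assumptions, and the function names are hypothetical.

```python
import math
import random

def init_qubits(n):
    """Start every qubit in equal superposition: alpha = beta = 1/sqrt(2) (assumed)."""
    a = 1.0 / math.sqrt(2.0)
    return [(a, a) for _ in range(n)]

def rotate(qubit, delta_theta):
    """Apply the rotation gate of Eq. (11) to one (alpha, beta) pair."""
    alpha, beta = qubit
    c, s = math.cos(delta_theta), math.sin(delta_theta)
    return (c * alpha - s * beta, s * alpha + c * beta)

def measure(qubits, rng=random):
    """Collapse each qubit to a bit; bit d is 1 with probability |beta_d|^2."""
    return [1 if rng.random() < beta ** 2 else 0 for _, beta in qubits]

def rotate_toward(qubits, target_bits, delta=0.05 * math.pi):
    """Rotate each qubit by +/- delta so that the target bit becomes more likely.

    The sign plays the role of S(alpha, beta); this simple rule is an assumption
    that holds while the amplitudes stay in the first quadrant.
    """
    return [rotate(q, delta if bit == 1 else -delta)
            for q, bit in zip(qubits, target_bits)]

# Three features: push the individual toward the mask [1, 0, 1], then sample a mask.
qubits = rotate_toward(init_qubits(3), [1, 0, 1])
mask = measure(qubits)
```

Because the gate is a rotation, $|\alpha|^{2}+|\beta|^{2}=1$ is preserved after every update, so the amplitudes remain valid probabilities.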

The mathematical model of the QMFOFS approach is established as follows. In general, a dataset for classification (i.e., supervised learning) has size ${N}_{S}\times {N}_{F}$, where ${N}_{S}$ is the number of instances and ${N}_{F}$ is the number of features. The objective of the FS problem is to select a subset of features $S$ from the full set of ${N}_{F}$ features such that the size of $S$ is less than ${N}_{F}$. This is achieved by minimizing the following objective function:

$${\text{Fit}}=\lambda \times {\gamma }_{S}+\left(1-\lambda \right)\times \left(\frac{\left|S\right|}{{N}_{F}}\right)$$

(12)

where ${\gamma }_{S}$ denotes the classification error obtained using $S$, $|S|$ is the number of selected features, and $\lambda$ balances ${\gamma }_{S}$ against $\left(\frac{\left|S\right|}{{N}_{F}}\right)$.
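The fitness of Eq. (12) is simple to compute once a classification error is available. The sketch below is illustrative: the default `lam=0.99`, the handling of an empty subset, and the `evaluate_error` callback (which would train and test a classifier on the selected features) are assumptions, not details given in the text.

```python
def fs_fitness(error_rate, n_selected, n_features, lam=0.99):
    """Eq. (12): weighted sum of classification error and subset-size ratio."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    if n_selected == 0:
        # An empty subset cannot be classified; treat it as the worst case (assumed).
        return float("inf")
    return lam * error_rate + (1.0 - lam) * (n_selected / n_features)

def mask_fitness(mask, evaluate_error, lam=0.99):
    """Score a binary feature mask; evaluate_error is a user-supplied callback."""
    selected = [j for j, bit in enumerate(mask) if bit]
    if not selected:
        return float("inf")
    return fs_fitness(evaluate_error(selected), len(selected), len(mask), lam)
```

At equal classification error, a smaller subset always receives a lower (better) fitness, which is what drives the search toward compact feature sets.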

3.2.3 The process involved in CNN-ALSTM-based classification

At this stage, the CNN-ALSTM model is employed for PD classification. The CNN-ALSTM is a hybrid DL approach that extracts features from the raw data and performs prediction using the LSTM-NN [26]. The CNN layers extract suitable features from the time series data, exposing additional hidden information that can improve the forecast accuracy. The experimental outcomes indicate that forecast performance is best when the CNN comprises one convolutional layer with 16 kernels of size $3\times 1$ followed by one convolutional layer with 32 kernels of size $3\times 1$. The feature vector obtained from the second CNN layer is input to the LSTM layer, which has at most 32 units, one per element of the feature vector. The attention mechanism assigns larger weights to the features most strongly associated with the present output. Finally, the fully connected (FC) layer processes the output vector of the attention mechanism after it has been flattened. The outcome is the forecasted value of AC2 at the following moment. The LSTM is well suited to forecasting experimental time series data, and recent work reports high predictive performance from combining CNN and LSTM in various applications. The CNN extracts the features of the experimental data for the LSTM, and the attention mechanism is a procedure for allocating weights. The inverse-normalized predicted power is obtained from Eq. (13).

$${{\text{Pr}}}_{P}={{\text{Pr}}}_{Iac2}*\left({P}_{{\text{max}}}-{P}_{{\text{min}}}\right)+{P}_{{\text{min}}},$$

(13)

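where ${{\text{Pr}}}_{P}$ is the forecasted value of the power, ${{\text{Pr}}}_{Iac2}$ is the forecasted (normalized) value of $AC2$, and ${P}_{{\text{max}}}$ and ${P}_{{\text{min}}}$ are the maximum and minimum values used for normalization.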
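Eq. (13) simply undoes a min-max normalization. A minimal sketch, assuming the forward transform was standard min-max scaling to $[0, 1]$ (the forward function and both names are illustrative):

```python
def min_max_normalize(p, p_min, p_max):
    """Forward transform assumed to precede Eq. (13): scale p into [0, 1]."""
    return (p - p_min) / (p_max - p_min)

def inverse_normalize(pr_norm, p_min, p_max):
    """Eq. (13): map a normalized prediction back to the original scale."""
    return pr_norm * (p_max - p_min) + p_min
```

For instance, a normalized prediction of 0.25 with a range of 10 to 50 maps back to 20 on the original scale.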

The LSTM cell structure effectively mitigates the exploding and vanishing gradient problems. The LSTM structure has four essential components: the cell state and the input, output, and forget gates. The gates control the updating, retention, and deletion of information in the cell state. The forward computation is as follows:

$$\begin{gathered} f_{z} = \sigma \left( {W_{f} \cdot \left[ {h_{z - 1} , x_{z} } \right] + b_{f} } \right), \\ i_{z} = \sigma \left( {W_{i} \cdot \left[ {h_{z - 1} , x_{z} } \right] + b_{i} } \right), \\ o_{z} = \sigma \left( {W_{o} \cdot \left[ {h_{z - 1} , x_{z} } \right] + b_{o} } \right), \\ \tilde{C}_{z} = {\text{tanh}}\left( {W_{C} \cdot \left[ {h_{z - 1} , x_{z} } \right] + b_{C} } \right), \\ C_{z} = f_{z} \cdot C_{z - 1} + i_{z} \cdot \tilde{C}_{z}, \\ h_{z} = o_{z} \cdot {\text{tanh}}\left( {C_{z} } \right), \\ \end{gathered}$$

(14)

where ${W}_{f}$, ${W}_{i}$, ${W}_{o}$, and ${W}_{C}$ are the weight matrices of the forget gate, input gate, output gate, and candidate cell state, respectively; ${b}_{f}$, ${b}_{i}$, ${b}_{o}$, and ${b}_{C}$ are the corresponding bias terms; $\left[{h}_{z-1},{x}_{z}\right]$ is the concatenation of the previous hidden state and the current input; $\sigma$ is the sigmoid activation function; and ${\text{tanh}}$ is the hyperbolic tangent activation function.
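One forward step of Eq. (14) can be written out directly. The following pure-Python sketch is for illustration only: it uses plain lists instead of a tensor library, and the `params` dictionary layout and function names are assumptions rather than part of the described model.

```python
import math

def sigmoid(u):
    return 1.0 / (1.0 + math.exp(-u))

def affine(W, b, vec):
    """Return W @ vec + b for a list-of-rows matrix W."""
    return [sum(w * v for w, v in zip(row, vec)) + bi for row, bi in zip(W, b)]

def lstm_step(x, h_prev, c_prev, params):
    """One forward step of Eq. (14).

    params maps each of "f", "i", "o", "c" to a (W, b) pair, where W has one row
    per hidden unit and len(h_prev) + len(x) columns (an assumed layout).
    """
    hx = list(h_prev) + list(x)                                   # [h_{z-1}, x_z]
    f = [sigmoid(u) for u in affine(*params["f"], hx)]            # forget gate
    i = [sigmoid(u) for u in affine(*params["i"], hx)]            # input gate
    o = [sigmoid(u) for u in affine(*params["o"], hx)]            # output gate
    c_tilde = [math.tanh(u) for u in affine(*params["c"], hx)]    # candidate state
    # New cell state: keep part of the old state, add part of the candidate.
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, c_tilde)]
    # New hidden state: gated, squashed cell state.
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c
```

With all weights and biases set to zero, every gate evaluates to 0.5 and the candidate state to 0, so the cell state is simply halved at each step, which gives a quick sanity check of the gating logic.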

The attention mechanism imitates a signal-processing strategy of the human visual system, which rapidly scans the whole image to locate the region that requires attention and ignores regions carrying unnecessary information. Attention mechanisms have been applied successfully in model training and related areas. The proposed model uses the output vectors of the LSTM hidden units, $H=\{{h}_{1}, {h}_{2},\cdots ,{h}_{t}\}$, as the input of the attention mechanism, which determines the attention weight ${\alpha }_{i}$ of each ${h}_{i}$ as shown in Eq. (15).

$$\begin{gathered} e_{i} = {\text{tanh}}\left( {W_{h} h_{i} + b_{h} } \right), \\ \alpha_{i} = \frac{{{\text{exp}}\left( {e_{i} } \right)}}{{\sum \nolimits_{k = 1}^{t} {\text{exp}}\left( {e_{k} } \right)}}, \\ \end{gathered}$$

(15)

where ${\alpha }_{i}$ is the attention weight, ${W}_{h}$ is the weight matrix applied to ${h}_{i}$, and ${b}_{h}$ is the bias.
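The attention weights of Eq. (15) amount to a softmax over scalar scores. A minimal pure-Python sketch follows, assuming $W_{h}$ acts as a single row vector `w_h` so that each score $e_{i}$ is a scalar; the `context_vector` helper shows one common way to use the weights and is an assumption, since the text does not spell out this step.

```python
import math

def attention_weights(H, w_h, b_h):
    """Eq. (15): score each hidden state with tanh, then normalize with a softmax."""
    scores = [math.tanh(sum(w * h for w, h in zip(w_h, h_i)) + b_h) for h_i in H]
    exps = [math.exp(e) for e in scores]
    total = sum(exps)
    return [e / total for e in exps]

def context_vector(H, alphas):
    """Weighted sum of the hidden states (assumed use of the attention weights)."""
    dim = len(H[0])
    return [sum(a * h_i[j] for a, h_i in zip(alphas, H)) for j in range(dim)]
```

Because the scores pass through tanh they lie in $(-1, 1)$, so the exponentials cannot overflow and no additional stabilization is needed in this small example.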

3.3 Hyperparameter tuning

To tune the hyperparameters of the CNN-ALSTM model, the Nadam optimizer is used. Nadam is an extension of the Adam optimizer [29] that incorporates Nesterov momentum and can be applied to improve the efficiency of DL approaches. The update rules of the Adam optimizer are given by the following equations:

$$g_{t} = \nabla_{{\theta_{t} }} J\left( {\theta_{t} } \right)$$

(16)

$$m_{t} = \beta_{1} m_{t - 1} + \left( {1 - \beta_{1} } \right)g_{t} { }$$

(17)

$$v_{t} = \beta_{2} v_{t - 1} + \left( {1 - \beta_{2} } \right)g_{t}^{2}$$

(18)

$$\hat{m}_{{\text{t}}} = \frac{{m_{t} }}{{1 - \beta_{1}^{t} }}{ }$$

(19)

$$\hat{v}_{{\text{t}}} = \frac{{v_{t} }}{{1 - \beta_{2}^{t} }}$$

(20)

$$\theta_{t + 1} = \theta_{t} - \frac{\eta }{{\sqrt {\hat{v}_{t}} + \varepsilon }}\hat{m}_{t}$$

(21)

where ${g}_{t}$ is the gradient vector of the CNN-ALSTM model at training step $t$; $\eta$ is the learning rate; $J({\theta }_{t})$ is the objective (loss) function of the CNN-ALSTM model; ${\nabla }_{{\theta }_{t}}$ denotes the gradient of $J({\theta }_{t})$ with respect to the parameters $\theta$; ${m}_{t}$ and ${v}_{t}$ are the first- and second-order moment estimates of the gradient; ${\widehat{m}}_{t}$ and ${\widehat{v}}_{t}$ are the bias-corrected versions of ${m}_{t}$ and ${v}_{t}$, which offset the bias of the raw estimates; ${\beta }_{1}$ and ${\beta }_{2}$ are the exponential decay rates of ${m}_{t}$ and ${v}_{t}$; $\varepsilon$ is a small constant that keeps the denominator from being zero; and $t$ is the iteration index of the training process. Substituting Eq. (17) into Eqs. (19) and (21) gives

$${\theta }_{t+1}={\theta }_{t}-\frac{\eta }{\sqrt{{\widehat{v}}_{t}}+\varepsilon }\left(\frac{{\beta }_{1}{m}_{t-1}}{1-{\beta }_{1}^{t}}+\frac{(1-{\beta }_{1}){g}_{t}}{1-{\beta }_{1}^{t}}\right)$$

(22)

The term ${m}_{t-1}/(1-{\beta }_{1}^{t})$ is treated as the bias-corrected estimate ${\widehat{m}}_{t-1}$ of the momentum vector at the previous step of CNN-ALSTM training; substituting ${\widehat{m}}_{t-1}$ for it gives:

$${\theta }_{t+1}={\theta }_{t}-\frac{\eta }{\sqrt{{\widehat{v}}_{t}}+\varepsilon }\left({\beta }_{1}{\widehat{m}}_{t-1}+\frac{(1-{\beta }_{1}){g}_{t}}{1-{\beta }_{1}^{t}}\right)$$

(23)

With the addition of Nesterov momentum, the bias-corrected estimate ${\widehat{m}}_{t}$ of the present momentum vector of the CNN-ALSTM model directly replaces the bias-corrected estimate ${\widehat{m}}_{t-1}$ of the previous momentum, which yields the Nadam update rule:

$$\theta_{t+1} = \theta_{t} - \frac{\eta}{\sqrt{\hat{v}_{t}} + \varepsilon}\left(\beta_{1} \hat{m}_{t} + \frac{(1 - \beta_{1}) g_{t}}{1 - \beta_{1}^{t}}\right)$$

(24)

The conventional momentum approach has the drawback that the learning rate stays fixed throughout training and that a single learning rate is used to update all weights.
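The Nadam update in Eqs. (20) to (24) can be illustrated with a minimal pure-Python sketch. It is not the author's implementation: the function names `nadam_step` and `minimize_quadratic`, the default hyperparameter values, and the toy quadratic objective are illustrative assumptions, and the moment updates for $m_{t}$ and $v_{t}$ are assumed to take the standard Adam form because they are not shown in this part of the text.

```python
import math


def nadam_step(theta, grad, m, v, t, lr=0.002, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one Nadam update to a list of parameters; t is the 1-based step index."""
    new_theta, new_m, new_v = [], [], []
    for th, g, m_prev, v_prev in zip(theta, grad, m, v):
        # Moment estimates (assumed standard Adam form, not shown in this section).
        m_t = beta1 * m_prev + (1.0 - beta1) * g
        v_t = beta2 * v_prev + (1.0 - beta2) * g * g
        # Deviation-corrected estimates; v_hat follows Eq. (20).
        m_hat = m_t / (1.0 - beta1 ** t)
        v_hat = v_t / (1.0 - beta2 ** t)
        # Eq. (24): the current m_hat replaces the previous one (Nesterov look-ahead).
        direction = beta1 * m_hat + (1.0 - beta1) * g / (1.0 - beta1 ** t)
        new_theta.append(th - lr / (math.sqrt(v_hat) + eps) * direction)
        new_m.append(m_t)
        new_v.append(v_t)
    return new_theta, new_m, new_v


def minimize_quadratic(steps, lr=0.01):
    """Toy check: minimize J(theta) = sum(theta_i ** 2), whose gradient is 2 * theta."""
    theta, m, v = [5.0, -3.0], [0.0, 0.0], [0.0, 0.0]
    for t in range(1, steps + 1):
        grad = [2.0 * x for x in theta]
        theta, m, v = nadam_step(theta, grad, m, v, t, lr=lr)
    return theta
```

Because the step is divided by $\sqrt{\hat{v}_{t}} + \varepsilon$ separately for each parameter, every weight effectively gets its own step size, which is the contrast with a single fixed learning rate drawn in the text. Calling `minimize_quadratic(100)` moves both coordinates toward zero.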

4 Experimental validation

The QMFOFS-HCNN technique is validated on four benchmark datasets: HandPD meander, HandPD spiral, voice PD, and speech PD [30]. Three evaluation metrics are used: accuracy, detection rate, and false alarm rate (FAR). Accuracy is the proportion of observations that are correctly classified. The detection rate is the percentage of actual positives that are correctly identified, that is, the cases in which the model correctly predicts the positive class. The FAR captures the cases in which the model incorrectly predicts the positive class.
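The three metrics can be computed from the counts of a binary confusion matrix, as in the hedged sketch below. The text gives no explicit formulas, so the definitions used here are assumptions based on common usage: detection rate as TP / (TP + FN) and FAR as FP / (FP + TN). The names `Confusion`, `confusion_from_labels`, `accuracy`, `detection_rate`, and `false_alarm_rate` are illustrative, and the label vectors in the usage note are made-up toy data, not results from the study.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Confusion:
    tp: int  # PD cases predicted as PD
    fp: int  # healthy cases predicted as PD
    tn: int  # healthy cases predicted as healthy
    fn: int  # PD cases predicted as healthy


def confusion_from_labels(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> Confusion:
    """Count the four outcomes, treating `positive` as the PD class."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return Confusion(tp, fp, tn, fn)


def accuracy(c: Confusion) -> float:
    """Proportion of all observations that are correctly classified."""
    return (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)


def detection_rate(c: Confusion) -> float:
    """Share of actual positives that are correctly identified (assumed TP / (TP + FN))."""
    return c.tp / (c.tp + c.fn)


def false_alarm_rate(c: Confusion) -> float:
    """Share of actual negatives flagged as positive (assumed FP / (FP + TN))."""
    return c.fp / (c.fp + c.tn)
```

For the toy labels `y_true = [1, 1, 1, 0, 0, 0, 0, 1]` and `y_pred = [1, 1, 0, 0, 0, 1, 0, 1]`, the counts are TP = 3, FP = 1, TN = 3, FN = 1, giving an accuracy of 0.75, a detection rate of 0.75, and a FAR of 0.25. The sketch assumes both classes are present; otherwise a denominator would be zero.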

Figure 3 compares the FS results of the QMFOFS-HCNN system with those of existing methods on the four datasets. The QMFOFS-HCNN technique selects the smallest number of features. For instance, on the HandPD spiral dataset, the QMFOFS-HCNN technique selects three features, whereas the modified grasshopper optimization algorithm (MGOA) [31], modified grey wolf optimizer (MGWO) [32], optimized cuttlefish algorithm (OCFA) [30], and improved sailfish optimization algorithm with deep learning (IFSO-DL) [33] systems select 5, 7, 8, and 4 features, respectively. Similarly, on the voice PD dataset, the QMFOFS-HCNN technique selects six features, while the MGOA, MGWO, OCFA, and IFSO-DL systems select 8, 9, 17, and 7 features, respectively.

Table 1 presents the comparative PD detection analysis of the QMFOFS-HCNN system and existing approaches on the HandPD spiral and HandPD meander datasets [31, 33].

Table 1 Comparison of existing models with the proposed model on the HandPD spiral and HandPD meander datasets


Figure 4 compares the accuracy (${\text{accu}}_{y}$) of the QMFOFS-HCNN system with that of existing techniques on the HandPD spiral and HandPD meander datasets. The QMFOFS-HCNN technique achieves higher accuracy than the other techniques on both datasets. For instance, on the HandPD spiral dataset, the QMFOFS-HCNN technique reaches the highest ${\text{accu}}_{y}$ of 96.35%, whereas the MGOA-KNeN, MGOA-RANDF, MGOA-DT (C4.5), MGWO-KNeN, MGWO-RANDF, MGWO-DT (C4.5), and IFSO-DL techniques obtain lower ${\text{accu}}_{y}$ values of 75.54%, 92.62%, 89.88%, 74.13%, 92.62%, 92.03%, and 93.61%, respectively.

Figure 5 compares the QMFOFS-HCNN technique with recent models in terms of detection rate ($d_{\text{rate}}$) on the HandPD spiral and HandPD meander datasets. The QMFOFS-HCNN system achieves the highest $d_{\text{rate}}$ values among the compared techniques on both datasets. For instance, on the HandPD spiral dataset, the QMFOFS-HCNN technique attains a $d_{\text{rate}}$ of 99.22%, whereas the MGOA-KNeN, MGOA-RANDF, MGOA-DT (C4.5), MGWO-KNeN, MGWO-RANDF, MGWO-DT (C4.5), and IFSO-DL techniques yield lower $d_{\text{rate}}$ values of 84.89%, 97.99%, 95.58%, 82.54%, 94.99%, 93.65%, and 98.04%, respectively.

Figure 6 shows the accuracy and loss curves of the QMFOFS-HCNN system on the HandPD spiral and HandPD meander datasets. Accuracy tends to increase and loss tends to decrease as the epoch count grows. The training loss is low and the validation accuracy is high on both datasets.

Table 2 presents the comparative PD detection results of the QMFOFS-HCNN technique and existing approaches on the speech PD and voice PD datasets.

Figure 7 compares the ${\text{accu}}_{y}$ of the QMFOFS-HCNN technique with that of existing methods on the speech PD and voice PD datasets. The QMFOFS-HCNN system achieves higher accuracy than the other techniques on both datasets. For instance, on the speech PD dataset, the QMFOFS-HCNN technique reaches the highest ${\text{accu}}_{y}$ of 98.50%, whereas the MGOA-KNeN, MGOA-RANDF, MGOA-DT (C4.5), MGWO-KNeN, MGWO-RANDF, MGWO-DT (C4.5), and IFSO-DL approaches obtain lower ${\text{accu}}_{y}$ values of 89.69%, 95.56%, 85.53%, 92.35%, 93.64%, 90.18%, and 96.19%, respectively.

Table 2 Comparison of existing models with the proposed model on the speech PD and voice PD datasets


Figure 8 compares the QMFOFS-HCNN approach with recent models in terms of detection rate ($d_{\text{rate}}$) on the speech PD and voice PD datasets. The QMFOFS-HCNN system attains the highest reported $d_{\text{rate}}$ values on both datasets. For instance, on the speech PD dataset, the QMFOFS-HCNN approach attains a $d_{\text{rate}}$ of 99.98%, whereas the MGOA-KNeN, MGOA-RANDF, MGOA-DT (C4.5), MGWO-KNeN, MGWO-RANDF, MGWO-DT (C4.5), and IFSO-DL systems yield $d_{\text{rate}}$ values of 96.56%, 90.17%, 97.21%, 99.95%, 94.28%, 99.16%, and 99.98%, respectively. As reported, IFSO-DL therefore matches the proposed method on this metric, while the remaining systems score lower.

Figure 9 shows the accuracy and loss curves of the QMFOFS-HCNN methodology on the speech PD and voice PD datasets. Accuracy tends to increase and loss tends to decrease as the epoch count grows. The training loss is low and the validation accuracy is high on both datasets. These results indicate that the proposed model is superior to the compared PD classification methods.
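To make the size of the reported gains concrete, the short sketch below computes the margin, in percentage points, between QMFOFS-HCNN and the best competing method for the four comparisons quoted in the text. The numbers are copied from the paragraphs above; the helper `margin_over_best` and the layout of the `REPORTED` dictionary are illustrative assumptions and not part of the study.

```python
METHODS = [
    "MGOA-KNeN", "MGOA-RANDF", "MGOA-DT (C4.5)",
    "MGWO-KNeN", "MGWO-RANDF", "MGWO-DT (C4.5)", "IFSO-DL",
]

# (dataset, metric) -> (QMFOFS-HCNN value, baseline values in METHODS order), all in %.
REPORTED = {
    ("HandPD spiral", "accuracy"): (96.35, [75.54, 92.62, 89.88, 74.13, 92.62, 92.03, 93.61]),
    ("HandPD spiral", "detection rate"): (99.22, [84.89, 97.99, 95.58, 82.54, 94.99, 93.65, 98.04]),
    ("Speech PD", "accuracy"): (98.50, [89.69, 95.56, 85.53, 92.35, 93.64, 90.18, 96.19]),
    ("Speech PD", "detection rate"): (99.98, [96.56, 90.17, 97.21, 99.95, 94.28, 99.16, 99.98]),
}


def margin_over_best(proposed, baseline_values):
    """Return the best baseline's name and the proposed method's lead in percentage points."""
    baselines = dict(zip(METHODS, baseline_values))
    best = max(baselines, key=baselines.get)
    return best, round(proposed - baselines[best], 2)


if __name__ == "__main__":
    for (dataset, metric), (proposed, values) in REPORTED.items():
        best, margin = margin_over_best(proposed, values)
        print(f"{dataset:14s} {metric:15s} best baseline: {best:8s} margin: {margin:+.2f} pp")
```

With these inputs, IFSO-DL is the strongest baseline in all four comparisons. The margins are 2.74 and 1.18 percentage points for accuracy and detection rate on HandPD spiral, and 2.31 and 0.00 percentage points on speech PD.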

The study compared the QMFOFS-HCNN technique with existing methods for Parkinson's disease (PD) detection on four benchmark datasets. The QMFOFS-HCNN technique demonstrated several strengths. First, it selected fewer features while maintaining or improving classification performance, which reduces data dimensionality and makes feature selection more efficient. Second, it achieved higher accuracy than the existing methods, improving the classification of PD patients. Third, it achieved higher detection rates, which are crucial for accurate PD diagnosis, and exhibited a lower FAR, which reduces the risk of misdiagnosis. Fourth, its advantage held across the different datasets, demonstrating its versatility. Finally, the model showed increasing accuracy and decreasing loss during training, indicating effective learning.

5 Conclusion

This study has developed a novel QMFOFS-HCNN method for detecting and classifying PD. It aimed to identify optimal feature subsets and to enhance the classification accuracy of PD diagnosis. The proposed QMFOFS-HCNN technique comprises several processes: QMFO-based feature selection, CNN-ALSTM-based classification, and Nadam-based hyperparameter tuning. Combining QMFO-based feature selection with Nadam-based tuning helps to improve the PD classification outcomes. The experimental validation used benchmark datasets, and the results were assessed under several aspects. The comparative results indicated promising performance of the QMFOFS-HCNN technique on several evaluation metrics. Therefore, the QMFOFS-HCNN technique can serve as a proficient tool for PD detection and classification, contributing to medical diagnostics. However, it is important to acknowledge some limitations of this study. Firstly, the performance evaluation was conducted on benchmark datasets, and the real-world applicability of the technique may require further validation with more diverse and more extensive datasets. Secondly, while the QMFOFS-HCNN method shows promise, it may benefit from additional optimization and fine-tuning to achieve even higher accuracy.

In the future, outlier detection techniques can be incorporated into the QMFOFS-HCNN technique to improve the classification results and robustness. Additionally, integrating real-time data collection and analysis for PD diagnosis could improve the practicality and timeliness of the method. Overall, this study lays a foundation for more advanced and effective PD diagnostic tools, and further refinement and validation in clinical settings will be essential for successful implementation.

Data availability

The dataset used in this study is publicly available via the following link: https://wwwp.fc.unesp.br/~papa/pub/datasets/Handpd/.

Abbreviations

AI: Artificial intelligence
DL: Deep learning
ECSO: Expanded cat swarm optimization
EDF-EMD: Energy direction feature based empirical mode decomposition
MGOA: Modified grasshopper optimization algorithm
MGWO: Modified grey wolf optimizer
OCFA: Optimized cuttlefish algorithm
IFSO-DL: Improved sailfish optimization algorithm with deep learning
CNN-ALSTM: Convolutional neural network with attention-based long short-term memory
QMFOFS-HCNN: Quantum mayfly optimization-based feature subset selection with hybrid convolutional neural network
RF: Random forest
SVM: Support vector machine
K-NN: K-nearest neighbour
MLP: Multilayer perceptron
PD: Parkinson's disease
MF: Male and female flies
DNN: Deep neural network
FAR: False alarm rate
FS: Feature selection
ANN: Artificial neural network
IMF: Intrinsic mode function
FF: Fitness function
QMFO: Quantum mayfly optimization

References

1. Ali L, Zhu C, Golilarz NA, Javeed A, Zhou M, Liu Y (2019) Reliable Parkinson’s disease detection by analyzing handwritten drawings: construction of an unbiased cascaded learning system based on feature selection and adaptive boosting model. IEEE Access 7:116480–116489
2. Pahuja G, Nagabhushan TN (2021) A comparative study of existing machine learning approaches for Parkinson’s disease detection. IETE J Res 67(1):4–14
3. Zahid L, Maqsood M, Durrani MY, Bakhtyar M, Baber J, Jamal H, Mehmood I, Song OY (2020) A spectrogram-based deep feature assisted computer-aided diagnostic system for Parkinson’s disease. IEEE Access 8:35482–35495
4. Senturk ZK (2020) Early diagnosis of Parkinson’s disease using machine learning algorithms. Med Hypotheses 138:109603
5. Wang W, Lee J, Harrou F, Sun Y (2020) Early detection of Parkinson’s disease using deep learning and machine learning. IEEE Access 8:147635–147646
6. Wroge TJ, Özkanca Y, Demiroglu C, Si D, Atkins DC, Ghomi RH (2018) Parkinson’s disease diagnosis using machine learning and voice. In: 2018 IEEE Signal Processing in Medicine and Biology Symposium (SPMB), pp 1–7. IEEE
7. Lahmiri S, Dawson DA, Shmuel A (2018) Performance of machine learning methods in diagnosing Parkinson’s disease based on dysphonia measures. Biomed Eng Lett 8(1):29–39
8. Rovini E, Maremmani C, Moschetti A, Esposito D, Cavallo F (2018) Comparative motor pre-clinical assessment in Parkinson’s disease using supervised machine learning approaches. Ann Biomed Eng 46(12):2057–2068
9. Belić M, Bobić V, Badža M, Šolaja N, Đurić-Jovičić M, Kostić VS (2019) Artificial intelligence for assisting diagnostics and assessment of Parkinson’s disease—a review. Clin Neurol Neurosurg 184:105442
10. Ouhmida A, Raihani A, Cherradi B, Terrada O (2021) A novel approach for Parkinson’s disease detection based on voice classification and features selection techniques. Int J Online & Biomed Eng 17(10):111
11. Johri A, Tripathi A (2019) Parkinson disease detection using deep neural networks. In: 2019 Twelfth International Conference on Contemporary Computing (IC3), pp 1–4. IEEE
12. El Maachi I, Bilodeau GA, Bouachir W (2020) Deep 1D-Convnet for accurate Parkinson disease detection and severity prediction from gait. Expert Syst Appl 143:113075
13. Jayaram R, Senthil Kumar T (2022) Cloud-based Parkinson disease prediction system using expanded cat swarm optimization. IoT and analytics for sensor networks. Springer, Singapore, pp 299–309
14. Solana-Lavalle G, Galán-Hernández JC, Rosas-Romero R (2020) Automatic Parkinson disease detection at early stages as a pre-diagnosis tool by using classifiers and a small set of vocal features. Biocybern Biomed Eng 40(1):505–516
15. Mathur R, Pathak V, Bandil D (2019) Parkinson disease prediction using machine learning algorithm. Emerging trends in expert applications and security. Springer, Singapore, pp 357–363
16. Haq AU, Li J, Memon MH, Khan J, Din SU, Ahad I, Sun R, Lai Z (2018) Comparative analysis of the classification performance of machine learning classifiers and deep neural network classifier for prediction of Parkinson disease. In: 2018 15th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp 101–106. IEEE
17. Zhang T, Zhang Y, Sun H, Shan H (2021) Parkinson disease detection using energy direction features based on EMD from voice signal. Biocybern Biomed Eng 41(1):127–141
18. Vidya B, Sasikumar P (2022) Parkinson’s disease diagnosis and stage prediction based on gait signal analysis using EMD and CNN–LSTM network. Eng Appl Artif Intell 114:105099
19. Klaar ACR, Stefenon SF, Seman LO, Mariani VC, Coelho LDS (2023) Optimized EWT-Seq2Seq-LSTM with attention mechanism to insulators fault prediction. Sensors 23(6):3202
20. Borré A, Seman LO, Camponogara E, Stefenon SF, Mariani VC, Coelho LDS (2023) Machine fault detection using a hybrid CNN-LSTM attention-based model. Sensors 23(9):4512
21. Balaji E, Brindha D, Elumalai VK, Vikrama R (2021) Automatic and non-invasive Parkinson’s disease diagnosis and severity rating using LSTM network. Appl Soft Comput 108:107463
22. Koundal D, Jain DK, Guo Y, Ashour AS, Zaguia A (eds) (2023) Data analysis for neurodegenerative disorders. Springer Nature
23. Oktay AB, Kocer A (2020) Differential diagnosis of Parkinson and essential tremor with convolutional LSTM networks. Biomed Signal Process Control 56:101683
24. Khan YF, Kaushik B, Koundal D (2023) Machine learning models for Alzheimer’s disease detection using medical images. Data analysis for neurodegenerative disorders. Springer Nature Singapore, Singapore, pp 165–182
25. Wang D, Chen H, Li T, Wan J, Huang Y (2020) A novel quantum grasshopper optimization algorithm for feature selection. Int J Approx Reason 127:33–53
26. Sethi M, Ahuja S, Rani S, Koundal D, Zaguia A, Enbeyle W (2022) An exploration: Alzheimer’s disease classification based on convolutional neural network. BioMed Res Int. https://doi.org/10.1155/2022/8739960
27. Zervoudakis K, Tsafarakis S (2020) A mayfly optimization algorithm. Comput Ind Eng 145:106559
28. Shaheen MA, Hasanien HM, El Moursi MS, El-Fergany AA (2021) Precise modeling of PEM fuel cell using improved chaotic MayFly optimization algorithm. Int J Energy Res 45(13):18754–18769
29. Bera S, Shrivastava VK (2020) Analysis of various optimizers on deep convolutional neural network model in the application of hyperspectral remote sensing image classification. Int J Remote Sens 41(7):2664–2683
30. Gupta D, Julka A, Jain S et al (2018) Optimized cuttlefish algorithm for diagnosis of Parkinson’s disease. Cogn Syst Res 52:36–48
31. Sehgal S, Agarwal M, Gupta D, Sundaram S, Bashambu A (2020) Optimized grass hopper algorithm for diagnosis of Parkinson’s disease. SN Appl Sci 2(6):1–18
32. Sharma P, Sundaram S, Sharma M, Sharma A, Gupta D (2019) Diagnosis of Parkinson’s disease using modified grey wolf optimization. Cogn Syst Res 54:100–115
33. Zhang Y, Mo Y (2022) Chaotic adaptive sailfish optimizer with genetic characteristics for global optimization. J Supercomput 78:10950–10996. https://doi.org/10.1007/s11227-021-04255-9


Funding

Open access funding provided by The Science, Technology & Innovation Funding Authority (STDF) in cooperation with The Egyptian Knowledge Bank (EKB).

Author information

Authors and Affiliations

Department of Mathematics, Faculty of Science, New Valley University, El-Kharga, 72511, Egypt
Romany F. Mansour


Corresponding author

Correspondence to Romany F. Mansour.

Ethics declarations

Conflict of interest

The author declares no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.


About this article

Cite this article

Mansour, R.F. Quantum mayfly optimization based feature subset selection with hybrid CNN for biomedical Parkinson’s disease diagnosis. Neural Comput & Applic 36, 8383–8396 (2024). https://doi.org/10.1007/s00521-024-09516-1


Received: 16 January 2022
Accepted: 14 January 2024
Published: 22 February 2024
Issue Date: May 2024
DOI: https://doi.org/10.1007/s00521-024-09516-1
