RETRACTED ARTICLE: Prediction of gestational diabetes based on explainable deep learning and fog computing

El-Rashidy, Nora; ElSayed, Nesma E.; El-Ghamry, Amir; Talaat, Fatma M.

doi:10.1007/s00500-022-07420-1

RETRACTED ARTICLE: Prediction of gestational diabetes based on explainable deep learning and fog computing

Data analytics and machine learning
Open access
Published: 27 August 2022

Volume 26, pages 11435–11450, (2022)
Cite this article

Download PDF

You have full access to this open access article

Soft Computing Aims and scope Submit manuscript

RETRACTED ARTICLE: Prediction of gestational diabetes based on explainable deep learning and fog computing

Download PDF

Nora El-Rashidy ORCID: orcid.org/0000-0001-8177-9439¹,
Nesma E. ElSayed²,
Amir El-Ghamry³ &
…
Fatma M. Talaat¹

3578 Accesses
15 Citations
1 Altmetric
Explore all metrics

This article was retracted on 10 August 2023

This article has been updated

Abstract

Gestational diabetes mellitus (GDM) is one of the pregnancy complications that endangers both mothers and babies. GDM is usually diagnosed at 22–26 weeks of gestation. However, early prediction is preferable because it may decrease the risk. The continuous monitoring of the mother’s vital signs helps in predicting any deterioration during pregnancy. The originality of this research is to provide a comprehensive framework for pregnancy women monitoring. The proposed Data Replacement and Prediction Framework consists of three layers, which are: (i) Internet of things (IoT) Layer, (ii) Fog Layer, and (iii) Cloud Layer. The first layer used IoT sensors to aggregate vital signs from pregnancies using invasive and non-invasive sensors. The vital signs are then transmitted to fog nodes to be processed and finally stored in the cloud layer. The main contribution in this research is located in the fog layer producing the GDM module to implement two influential tasks which are as follows: (i) Data Finding Methodology (DFM), and (ii) Explainable Prediction Algorithm (EPM) using DNN. First, the DFM is used to replace the unused data to free up the cache space for new incoming data items. The cache replacement is very important in the case of the healthcare system as the incoming vital signs are frequent and must be replaced continuously. Second, the EPM is used to predict the occurrence of GDM in the second trimester of the pregnancy. To evaluate our model, we extracted data from 16,354 pregnant women from the medical information mart for intensive care (MIMIC III) benchmark dataset. For each woman, vital signs, demographic data, and laboratory tests were aggregated. The results of the prediction model are superior to the state-of-the-art (ACC = 0.957, AUC = 0.942). Regarding explainability, we used Shapley additive explanation (SHAP) framework to provide local and global explanations for the developed models. Overall, the proposed framework is medically intuitive and allows the early prediction of GDM with a cost-effective solution.

Utilizing fog computing and explainable deep learning techniques for gestational diabetes prediction

Article Open access 20 December 2022

Remote Patient Monitoring Using IoT, Cloud Computing and AI

HeartFog: Fog Computing Enabled Ensemble Deep Learning Framework for Automatic Heart Disease Diagnosis

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

1 Introduction

GDM is a pregnancy-related irregular glucose-level status. It is a common pregnancy complication that is recognized in 3–10% of pregnancies (Risk et al. 2007; Zhu and Zhang 2016). GDM is usually diagnosed between 22 and 26 weeks of gestation and may result in high-risk complications for both women and infants. These complications include respiratory problems, metabolic disorders, premature delivery, and the fetus gaining weight that may hamper the birthing process. Although GDM normally goes away after birth, women are still at a high risk of developing type 2 diabetes, with a cumulative incidence of 30–50% within 5–10 years following the index pregnancy (Zhu and Zhang 2016; Egan et al. 2021). Several studies have found that high-risk complications can be prevented if the medical intervention begins in the first or second trimester (Christophi et al. 2008).

Therefore, early detection of GDM is critical for averting a range of problems, including: (i) Evidence suggests that pre-diabetes treatment response varies when GDM history is taken into consideration (Christophi et al. 2008; Aroda, et al. 2015). (ii) Individualized risk prediction and treatment response estimation could also help in the selection of pre-diabetes treatment (Kent 2015; Herman, et al. 2017). (iii) Women with a history of GDM can learn about their future diabetes risk and how metformin and/or ILI might help (Herman et al. 2017; Intervention and Metformin 2006). Individual risk estimation could help doctors make better clinical decisions and make diabetes prevention programs more efficient, cost-effective, and patient-centered (Kent 2015; Herman et al. 2017).

There are several models available to assess the risk of developing diabetes in the general population (Mathur et al. 2011; Lindstrom 2008). However, few people use multivariable models to help customize preventive treatments to specific people (Kent 2015; Herman et al. 2017) Lindstrom 2008; Costa et al. 2012). Predictor variables in models specifically designed for women with previous GDM frequently incorporate measurements taken during or shortly after pregnancy (e.g., insulin use during pregnancy or breastfeeding history) (Lindstorm 2008; Ekelund et al. 2009; Ignell et al. 2016).

In the last decades, several studies have used data from electronic health records (EHR) to diagnose and forecast patient future events, such as mortality prediction (Awad et al. 2017; El-rashidy et al. 2020), sepsis prediction (Adams et al. 2015; El-Rashidy et al. 2021a), predict heart problems (Forkan and Khalil 2016), GDM complications (Zhang et al. 2020; Ahmadi and Mirbagheri 2019), etc. However, only a few studies have been conducted to predict GDM (Burlina et al. 2016; Savvidou et al. 2010). For example: (i) a recent study developed a model to predict GDM, which includes pregnancy body mass index (BMI), and gestational age fasting glucose. (ii) Xiong et al. (Zheng et al. 2019a) decided to develop a risk prediction mechanism for the first 19 weeks using high-potential GDM predictors (light GBM), a support vector Machine (SVM) and a light gradient boosting Machine (iii) Zheng et al. (2019b) presented a simple method for detecting GDM in early pregnancy using biochemical markers and the ML method. (iv) According to Shen et al. (2020), the research on the best AI approach for GDM prediction required the least number of clinical devices and trainees to construct an AI-based application (AI). (D Care 2018; Qiu et al. 2017).

Other research (Wu et al. 2021; Nuzzo et al. 2021) sought to develop models based on risk factors discovered in the first trimester that can predict an abnormal OGTT at 24–28 weeks. They considered various indicators that predict GDM, including scoring methods, glucose biochemistry assays, and glycosylated hemoglobin (HbA1c) levels have all been used in different populations with varying degrees of success (Meertens et al. 2019; Olmedo et al. 2017). According to a clinical study, GDM can be prevented if a comprehensive lifestyle change is introduced before the 20th week of pregnancy (Shen et al. 2016; Ramos et al. 2019).

Unless the preciously developed models for GDM perform well, all of them neglect the explainability issue, focusing instead on enhancing the performance of the ML model. Therefore, most of them are not accepted in the medical field. The importance of explainability in ML applications has grown since it helps toward providing a transparent model that could explain the output decision. Traditional models that deal with hundreds of variables struggle to understand the impact of each feature on the overall decision, and the features that make the developed model shift toward one of the classes. Another important issue is the ongoing explainability of models, which is used to determine the variable importance and the effect of changes in the developed model. Our goal was to develop a clinical diabetes risk prediction model for women who have already been diagnosed with GDM.

The prediction model is based on the vital signs obtained from a set of sensors connected to the woman. And when we talk about sensors sending data, it leads to talking about IoT (Talaat et al. 2019). IoT generates a huge amount of data which must be sent to cloud-based Data Centers. To minimize the latency, which is a key concern in such cases like healthcare (Atlam 2018), Fog Computing (FC) is a mandatory decision. The FC is not a replacement for Cloud Computing, but rather an extension of it that makes use of resources from edge devices (Bonomi et al. 2012). Hence, the FC increases QoS parameters such as bandwidth efficiency and energy consumption while also lowering the latency (Talaat et al. 2020).

The originality of this research is that it provides a comprehensive framework for monitoring pregnant women. The proposed Data Replacement and Prediction Framework (DRPF) is divided into three layers: (i) IoT, (ii) Fog, and (iii) Cloud. The first layer used IoT sensors to obtain vital indicators from pregnant women using invasive and non-invasive sensors. The vital indicators are transmitted to fog nodes to be processed and finally stored in the cloud layer. The key contribution in this research is in the fog layer producing the GDM module to perform two influential tasks: (i) Data Finding Methodology (DFM), and (ii) Explainable Prediction Algorithm (EPM) using DNN. First, the DFM is used to replace the unused data to free up the cache space for new incoming data items. The cache replacement is very important in the healthcare system since incoming vital signs are frequent and must be replaced continuously. Second, the EPM is used to predict the occurrence of GDM that may occur in the second trimester of the pregnancy.

2 Background and basic concepts

This section introduces some concepts in the field of Probabilistic Neural Networks (PNN), fog in healthcare applications, and data caching in fog.

2.1 Probabilistic neural networks (PNN)

A probabilistic neural network (PNN) is a type of feed forward neural network that is commonly used to solve classification and pattern recognition tasks. A Parzen window and a non-parametric function are used to approximate the parent probability distribution function (PDF) of each class in the PNN method. PNN is organized into a four-layer multilayered feed-forward network (Karthikeyan 2008; Venkatesh and Gopal 2011): (i) Input layer: is made up of nodes that each have a collection of metrics. (ii) Pattern layer: Each example in the training data set has its own neuron. It calculates the test case's Euclidean distance from the neuron's center point, then uses the sigma values to apply the Radial Basis Function (RBF) kernel function. (iii) Summation layer: For each class, executes a sum operation on the outputs from such as the highest-scoring label node. PNN has a number of advantages, including: (Karthikeyan 2008): (1) PNN networks predict target probability scores with high accuracy, and (2) As the size of the representative training set grows, it is guaranteed to converge to an optimal classifier second layer. (iv) Output layer: takes all of the summation nodes' outputs and outputs the maximum.

PNNs are a scalable alternative to classic back-propagation neural networks in classification and pattern recognition applications. They don't require the massive forward and backward calculations that regular neural networks do. They can also deal with a variety of training data. These networks leverage the concept of probability theory to reduce misclassifications when applied to a classification problem.

2.2 Explainability and interpretability of deep learning models

Explainable AI (XAI) is a framework that is used to open the black box of machine learning and help in understanding the output of the machine learning models (Vellido 2019). Explainability is also defined as the degree to which humans could understand the ML decision (Zheng et al. 2019), provide insights on the ML model, and discuss the logic behind this decision. Applying XAI provides three main advantages include (1) provides a clear explanation and boosts trust in the developed model. (2) Enables model troubleshooting (3) specifies the source of the model basis. Explainability and accuracy are considered two separate issues that should be maintained when building ML models. Generally, algorithms with high accuracy performance cannot provide a coherent explanation for their decisions and vice versa. The two main types of AI explainability include a global method that is used to understand the overall behavior of the model and the effect of each feature in the output decision and a local method is used to clarify the decision of the model for each instance (El-Sappagh et al. 2021). The interpretable model is very critical, especially in the medical domains, to translating the output decision into human-understandable language.

2.3 Fogs in healthcare applications

Healthcare services and applications are delay-sensitive. They deal with the private data of the patients (Aazam et al. 2015). The patients' data contain very sensitive and personal information, so the data location must be secured. High latency may cause many problems in tele-health and telemedicine applications, which makes FC a suitable paradigm in healthcare applications (Khan et al. 2021; Ahmadi et al. 2021). A simple sensor-to-cloud architecture is impractical for many health informatics applications. Regulations prohibit the storage of patient data outside a hospital in specific instances(Quy et al. 2021; Gupta and Dhurandher 2021). Because of patient safety concerns in the event of network and data center failures, reliance exclusively on remote data centers is also unsuitable for some applications (Habibi et al. 2020). FC is one possibility for bridging the gap between sensors and analytics in health informatics.

2.4 Data caching in fog

Reduced latency is a critical issue in the fog computing paradigm as the number of time-sensitive applications grows(Gupta and Dhurandher 2021; Shahid et al. 2020). As a result, one of the goals of an effective IoT application is to reduce fog computing latency(Khan et al. 2021; Unger et al. 2019). This approach uses popularity-based caching to achieve this goal, with a strong emphasis on the users' interests.

Data caching is a crucial topic in FC for boosting data availability and decreasing access latency. Because each Fog Node (FN) is so small, cache replenishment is a key concern. Cache replacement achieves load balancing in the FC context by ensuring data availability. The most common data caching approach in FC is cooperative caching. In this case, each FN's local cache is shared with its neighbors, resulting in a large unified cache. Each node in a cooperative caching system can obtain data not only from its local cache but also from the caches of its neighbors(Kraemer et al. 2020; Quy et al. 2021; Lloret 2019).

As a result, the data availability is maximized, the access delay is minimized, and the response time for the end-user layer is reduced. FNs share data in various fog applications, including healthcare (Ali et al. 2020), smart homes (Elhayatmy et al. 2018), industrial systems (Pop et al. 2020; Shahid et al. 2020), and intelligent traffic signals (Rahul and Aron 2021). As a result, sharing cache content among FNs has numerous advantages. The cache replacement method reduces response time by selecting a suitable set of data for caching. When the cache fills up, a data item must be removed to create room for the data that must be fetched (El-Rashidy et al. 2021b, 2020). The performance will improve if the least used data object is used.

3 Related work

3.1 Utilizing fog computing in healthcare systems

Real-time monitoring is required for healthcare applications. The cloud cannot meet real-time requirements (Verma and Sood 2018; Khaloufi et al. 2020). This cloud is ineffective for latency-sensitive applications. FC has been presented as a solution to these problems. Ahmad et al. (2016) proposed a health fog system in which FC serves as an intermediary layer between the cloud and the end-user. With this three-layer architecture, communication expenses are reduced.

Dilibal et al. (2020) proposed a smart FC architecture to reduce network latency and traffic. In this three-layer design, requests can be processed locally before being transmitted to the cloud. FC serves as an intermediary layer that improves network services while reducing the downsides of IoT health. Fog nodes are used in the healthcare IoT, to reduce latency (Khan et al. 2021). Greco et al. (Yi et al. 2015; Gupta and Dhurandher 2021) proposed a layered architecture aimed at addressing health monitoring issues. There are two types of health monitoring problems: static and dynamic monitoring.

Alli et al. (2020) proposed an IoT-Fog-Cloud ecosystem. It is an intriguing architecture in which IoT devices respond to user requests. The end devices are at the bottom, the fog layer is in the middle, and the cloud layer is at the top. This architecture supports localized computation, fog-edge computing, and remote computing. Abdelmoneem et al. (2021) described a system that dynamically distributes healthcare tasks across cloud and FC. This architecture will handle a wide range of health conditions and a large number of individuals.

3.2 Utilizing deep learning in predicting GDM

Predicting pregnancy deterioration is considered a critical issue in the medical field. Several researchers have recently used ML and DL to predict GDM and its consequence. For example, Wang (2021) predicts GDM using various ML algorithms including random forest (RF), SVM, and artificial neural network (ANN). The model, which was tested using data collected from different hospitals in Eastern China, resulted in inaccuracies ranging from 81 to 86%. Another study (Qiu et al. 2017) used patient EHR to predict GDM during early pregnancy based. The authors first employed six ML to improve accuracy (SVM, NN, logistic regression (LR), Bayesian network, and CHAID tree) and then developed a cost-effective hybrid model. The accuracy for training and testing was 86.5% and 84.7%, respectively. Similarly, Sumathi (Sumathi and Meganathan 2022) proposes a voting ensemble classifier based on multiple ML techniques such as (LR, SVM, RF, and k-nearest neighbor) that results in an accuracy of 94.24%.

Y. Liu et al. first studied the impact of several types of features and multiclass feature combinations on predicting GDM (Ali et al. 2020). They developed a feature screening method to automatically filter the appropriate number of features based on the importance of traits. Then, they vectorized features using depth representation methods like network embedding, analyzed the relationship between features using a similarity measurement method, and finally applied it to the classification model for prediction. This approach could automatically learn some aspects based on both domain knowledge rather than artificial rules, resulting in superior results, unless the enhanced performance of the developed model necessitates extra time and money to manage data by humans.

When the features are filtered by Wideband Bandpass Filters as in Elhayatmy et al. (2018), the accuracy, F1 value, and AUC value of LR are 0.809, 0.881, and 0.825, respectively, representing a 12% gain over when the feature is not used. The findings showed that a data drive based on electronic medical records can significantly improve the accuracy of predicting gestational diabetes. Zhong et al. (Pop et al. 2020) developed a method to assess the risk of GDM in second-trimester pregnancy. This model, which is based on several risk factors, has a high predictive value for developing GDM in pregnant Chinese women and may be useful in directing future clinical practice. However, there was no significant difference in terms of liver function between the two groups, which is an important indicator of visceral fat metabolism (especially hepatic fat metabolism).

Schwartz et al. (2021) developed and internally verified a therapeutically effective prediction model for women with past GDM, that includes fasting glucose, HbA1c, BMI, treatment arm, and BMI by treatment arm interaction. Integrating personalized diabetes risk prediction into pre-diabetes therapy decision-making should help researchers better understand the benefits of ILI and/or metformin in diabetes prevention. A clinical prediction model was devised for personalized decision-making in the management of pre-diabetes in women with past GDM. For women with a prior GDM, the estimated incidence of diabetes without therapy was 37.4%, compared with 20.0% with comprehensive lifestyle modification or metformin treatment. It is officially predicated on the presumption of a lady who has previously undergone a GDM. In most circumstances, it is not very accurate. Guo et al. developed a simple nomogram for pregnant Chinese women (Guo et al. 2020), which can be used to predict the likelihood of developing GDM during the first antenatal visit. This method can detect GDM early, allowing for more effective management to improve maternal outcomes. On the other hand, the AUC statistic is concerned just with prediction accuracy. A model with a higher AUC but a little lower sensitivity might be a better choice for clinical application. As a result, we used decision-analytic approaches based on our findings and theory to assess the worthiness of a model or other alternatives.

4 The proposed data replacement and prediction framework (DRPF)

One of the most significant applications related to the goals of IoT is an efficient healthcare system. In this regard, many factors should be taken into consideration, such as time, the privacy of data, and accuracy. The healthcare system should be reliable and accessible at all times. Accordingly, this research is concerned with designing an IoT-Fog-based healthcare system, as shown in Fig. 1. The proposed DRPF consists of three layers, which are: (i) IoT Layer, (ii) Fog Layer, and (iii) Cloud Layer. The IoT layer combines the IoT devices (pulse oximeter, ECG monitor, etc.) to observe the user status. The fog layer is responsible for handling the incoming requests and forwards them to the appropriate FN. The fog layer is divided into a set of fog regions, and layer 3 is the cloud datacenters. The roles of the proposed layers are detailed in the following subsections.

4.1 IoT layer

IoT devices are used because they provide a wide range of flexibility, for example, if a patient requires constant care, he or she can remain at home rather than in a hospital and be monitored frequently using IoT technology. The data transferred from the sensor to the control device and then to the monitoring center are affected by noise, impairing the data quality. Monitoring a large number of users on the IoT demands more storage and infrastructure, which can be avoided by storing data in the cloud.

4.2 Cloud layer

Cloud data centers are located at a remote distance away from IoT devices, which leads to high latency. This issue negatively affects the response time for real-time applications such as critical health monitoring systems, traffic monitoring, and emergency fire. Furthermore, IoT sources are geographically extended and can generate a large volume of data sent to the cloud for processing, which results in overloading. The edge computational resources can address the previously described challenges in IoT systems.

The patient data generated from the IoT sensors are delivered to the user application that uses the proposed GDM module. The used application sends its data to be processed in the fog layer. The main module called GDM Module is implemented and runs in the fog layer, as shown in Fig. 2. The GDM module is used to predict GDM with low latency.

4.3 Fog layer

Fog can be considered a computing paradigm that performs IoT applications at the network’s edge. The Fog improves the QoS metrics such as (bandwidth efficiency and energy consumption) and reduces latency(Ghosh et al. 2020). The main mission of fog is to deliver data and bring it closer to the user.

4.3.1 The proposed GDM module

The proposed GDM module is composed of two main sub-modules: (i) Data Finding Methodology (DFM), and (ii) Explainable Prediction Algorithm (EPM) using DNN.

4.3.1.1 Data finding methodology (DFM)

The DFM is used to replace the unused data to free up the cache space for the new incoming data items. The cache replacement is very important in the case of the healthcare system since the incoming vital signs are frequent and must be replaced continuously. Caching in a fog environment is constrained by bandwidth limitations, power limitations, and cache space limitations. A good replacement mechanism is necessary to discriminate between data items that should be preserved in the cache and those that should be discarded when the cache is full.

The network is divided into fog regions and each region has a Master Node (MN) that manages the communication in each fog region. The MN collects the required features of each FN, such as (i) Existing Data (ED), (ii) Time-To-Live (TTL), and (ii) cache size. The MN periodically checks each data feature to delete the data items with zero TTL. If the Fog cache’s server is full and there is incoming data, the MN can decide to delete a data item according to some criteria. As shown in Table 1, each FN has a table called Data Cache Table, which contains information about each di in its cache memory such as: (data item (di), Access Time (TA), Size of data (S), Access Frequency (FA), Access Count (AC), TTL, and Cache Free Size (CFS).

Table 1 Data Cache Table (DCT)

Full size table

The DFM leads to periodically update the cache and decrease the latency. The suggestive measures have been taken into account to measure the performance of the cashing schemes are: (i) Hit Ratio (HR), (ii) access latency, and (iii) power consumption. The access latency is defined as the average packet delay over a multi-hop route. It is used as a measure of the accessibility of the nodes.

The algorithm can use PNN to decide to remove a data item and replace it with new incoming data according to its features. The inputs to the PNN are TA, AC, and FA. The output of PNN is Data Replace (DR). DR can be either Yes or No. The steps of PNN-based cache replacement strategy are shown in Algorithm 1.

Using PNN, algorithm can decide to remove a data item and replace it with new incoming data according to its features. The input to the PNN is: TA, AC, and FA. The output of PNN is Data Replace (DR). DR can be Yes or No. The steps of PNN-based cache replacement strategy are shown in Algorithm 1.

4.3.1.2 Explainable prediction algorithm (EPM) using DNN

This section proposed an EPA model that detected the prevalence of GDM among pregnancies. Furthermore, to provide an understandable explanation of the predicted output, we evaluated our model based on the MIMIC III dataset (Alistair et al. 2016; Saeed et al. 2002) as shown in Fig. 3. The proposed EPM consists of four main steps: (a) Data Collection: collecting the required dataset using PostgreSQL, extracting data from various tables such as (patients, chart events, D_itmes, lab-events, and input_events), (b) Data Preprocessing: The output from the first step is cleaned and preprocessed using various steps including (removing outliers, standardization, and balancing), (c) Feature Extraction: using DNN to develop a classification model that could detect the occurrence of GDM. (d) Developing DL model: The output decision then used SHAP explainer to provide an understandable explanation of the developed decision. The performance of our model was evaluated using unseen data to ensure that the efficiency of the proposed model is promising, accurate, and explainable.

Data collection Medical Information Mart for Intensive Care III (MIMIC III) is a benchmark dataset developed by MIT Lab. It includes HER data for patients inside the intensive care unit. MIMIC III is accessible by obtaining confirmation from Physionet Organization. MIMIC III includes the data for 53.422 distinct patients. 4750 measurements and 390 laboratory tests were included in the MIMIC III dataset. As shown in Fig. 3, we extract data from the MIMIC III dataset in this research, including patient’s demographics (i.e., age, gender, BMI), vital signs (i.e., heart rate, respiratory rate, glucose level, etc.), and laboratory tests (i.e., Albumin, Creatine, Cholesterol, sodium, etc.).

The present study was conducted on 8740 pregnant women according to inclusion criteria including: (i) female gender that was adult (age > 20). (ii) Recorded as pregnant in mimic iii database (item_id (pregnant = 225,082, pregnant due date = 225,083). Gestational age between 6 and 26 weeks. Existing of required vital signs and laboratory tests. Features used in EPM are detailed in Table 2.

Table 2 Features used in EPM

Full size table

The output from the first step is cleaned and preprocessed using different steps including removing outliers, standardization, and balancing (Li et al. 2010). The steps of data preprocessing are as follows: (i) Data balancing: Class imbalance is a common problem, especially with the medical dataset. In MIMIC III, a minor number of pregnant women have GDM, which may lead to the problem of an imbalanced dataset. Two main techniques commonly used to handle this issue include oversampling (Mao et al. 2019) and under-sampling (Kaur and Gosain 2018). Oversampling techniques are used to increase the number of samples in the minority class, such as the synthetic minority oversampling technique, whereas under-sampling techniques such as Tomek link and random under-sampling are used to remove samples from the majority class. In this study, we used the random under-sampling technique to keep the data balance. The main advantage of using the under-sampling technique is that it does not introduce noise into the dataset. (ii) Handle missing values: The MIMIC dataset has approximately 15–20% of missing data. Several statistical techniques have been used to impute the missing values, such as expectation maximization (Moon 1996), hot decking encoding (Joenssen and Bankhofer 2012), etc. In this study, we removed data with more than 50% missing data. We only selected patients who had at least one record for each vital sign per day. Then, the forward and backward filling are used to fill the patient’s data. (iii) Scaling data: The extracted features have different values, which may vary in their value. These variations usually affect classifier performance. Therefore, we scaled all features to a range from 0 to − 1 using Minmax scales (Rachkidi 2015; Jäger et al. 2021).

Feature extraction In this section, we extracted two feature subsets A, and B as shown in Table 3. The feature set A: included the main vital signs (heart rate, glucose level. SPo2, blood pressure, etc.), and some laboratory tests include (PCT, total bilirubin, etc.). The feature set B: included all features in feature A, and other pregnancy-related features such as Gestational age, weight change, and other laboratory features such as Lymphocyte, Sodium, Vitamin E, Neutrophil, etc. These features have a critical effect on GDM detection. For example, Vitamin E is a critical measure to maintain the metabolism of the body and scavenging radical activities. The deficiency of Vitamin E among pregnancies may lead to vascular endothelial, the incidence of GDM, and hypertension, in addition to placental and premature birth (Kraemer et al. 2020). Therefore, considering vitamin E is important in GDM prediction. The same is true for lymphocytes, the count decreases during the first and the second trimesters and increases during the third. Increased lymphocytes could potentially contribute to irregular glucose levels.

Table 3 Features used in model A and model B

Full size table

Developing DL model The Dl model includes 20 input dimensions using dense and dropout layers. Dense layers are considered neural networks that are deeply connected. Each neuron in each layer receives an output from the previous layers. Dense layers are also used to change the vector dimension. A dropout layer is a regularization technique that is used to avoid overfitting by randomly ignoring some neurons during the training process (Smieja et al. 2018). As shown in Fig. 4, in the hidden layers, we used the activation function rectified linear activation function or “ReLU,” it is a linear activation function that produces the input directly if it is positive, otherwise, it will produce zero. In the last layer, we used the sigmoid activation function for binary classification (Khan et al. 2021). This produces a robust network with a good generalization ability and minimum likelihood to overfit. (Khan et al. 2021).

5 Results

To predict GDM, we used the basic feature set in model A, such as the patient’s age, heart rate, blood pressure, and other vital signs. EPM achieves adequate performance (Accuracy = 0.902%, AUC = 0.912%). Model B used the same features as model A, in addition to other features including gestational age, weight change, and other laboratory tests such as Albumin, and vitamin E, which have an impact on GDM incidence. The results demonstrate that the performance increased when adding weight change and gestational change (Accuracy = 0.957%, AUC = 0.942%). From the previous experiments, we observed the following: (i) patients with GDM ranged from 25 to 45 years (average 32.12 ± 5.6). (ii) GDM usually appeared in gestational age between (19 and 26 weeks). (iii) Both BMI and weight change during pregnancy is highly associated with GDM (GDM average for BMI was 28 ± 6.2 and for non-GDM was 21.66 ± 3.2). (iv) GDM in pregnant women was associated with a significant difference (P < 0.05) in liver and kidney functions that reflected in high values for Albumin, BUN, SBP, TC, etc. The overall results are illustrated in Table 4 and Fig. 5.

Table 4 Features used in model A and model B

Full size table

5.1 Statistical analysis

To ensure the superiority of the developed DNN model, both model a and b are compared using Freidman test (Dem 2006). Freidman test is non parametric test that is used to determine if there is significant difference between models. In order to choose the best performance model according to statistical test, the average rank for each model is calculated based on the Nemenyi test (Friedman 1990). Results of the Nemenyi test could be visualized using the critical difference diagram. Figure 6 shows a comparison between classification models based on the critical difference calculated based on the results of the Nemenyi test for all models. The test shows a significance difference between the developed models (Statistics = 9.855, \(P<0.005\)). Figure 7 shows that model B give the improved performance over model A (i.e., AUC = 0.942, \(P<0.005\)) followed by the same feature set after.

5.2 Evaluation of explainability of DL model

5.2.1 Global explainability

In this section, we use SHAP summary plots to show the behavior of the developed model in terms of different values of several features. Figure 7 shows one feature per horizontal line and the number of dots represents the correlation between the feature and the overall decision. The color of the dots represents the correlation (Red for high, and blue for low). We can draw the following observations from Fig. 7. (i) Albumin and weight change correlate significantly with GDM prediction, and higher values have a positive impact on predicting GDM. (ii) High PTT values have a negative effect on GDM prediction. (iii) The summary plot allows the specification of the effect of the outliers. For example, while the weight change is not the most important feature, it does have an impact in some circumstances. This appeared in the extended tail that was distrusted in both directions.

5.2.2 Local explainability

In this section, we use SHAP plots to explain the output decision in each case (local explainability). Figure 8 shows a GDM case with probability of 73% having GDM. It also shows the most influential feature values that move the result toward the positive class such as gestational age = 19, albumin = 56.75, neutrophil = 96.47, and other factors that move the decision not to have GDM, including weight change = 2.31.

The all-previous mentioned abbreviations are listed as shown in Table 5.

Table 5 List of abbreviations

Full size table

5.3 The performance metrics for DFM

The common performance metrics which are used to measure the performance of the cashing schemes are: (i) Hit Ratio (HR), (ii) access latency, and (iii) power consumption. Table 6 summarizes the definitions of the performance metrics.

Table 6 The Performance metrics to evaluate the proposed DSP scheme

Full size table

Assume the four data items located at the DCT have the parameters values shown in Table 7. And a new incoming data item (di_new) needs to be located at the DCT. The size of di_new is 0.204 MB.

Table 7 Four data items parameters located at DCT

Full size table

The performance of DFM comparing with the top state-of-the-art caching strategy is shown in Table 8.

Table 8 Comparing of DFM with the top state-of-the-art caching strategy

Full size table

From Table 7, it is shown that DFM has achieved the highest HR, the lowest access latency, and the lowest power consumption due to the high accuracy of using the PNN.

6 Conclusion

This research provided a comprehensive framework for monitoring pregnant women. The proposed DRPF consists of three layers, which are: (i) IoT Layer, (ii) Fog Layer, and (iii) Cloud Layer. The first layer used IoT sensors to aggregate vital signs from pregnancies using invasive and non-invasive sensors. Vital signs are transmitted to fog nodes for processing and finally stored in the cloud layer. The main contribution of this research is in the fog layer producing GDM module to implement two influential tasks which are: (i) DFM, and (ii) Explainable Prediction Algorithm (EPM) using DNN. First, the DFM is used to replace the unused data to free the cache space for the new incoming data items. The cache replacement is very important in the case of the healthcare system as the incoming vital signs are frequent and must be replaced continuously. Second, the EPM is used to predict the occurrence of GDM that may occur in the second trimester of pregnancy. The first DL model (model A) is based on vital signs, laboratory tests, and patient demographics. The second DL model (model B) used the same features, in addition to other pregnant features including weight change, gestational age, Lymphocyte, Sodium, Vitamin E, Neutrophil, etc. Utilizing Fog computing provides reduced latency when compared to cloud computing due to the use of only low-end computers, mobile phones, and personal devices in fog computing. The proposed system monitors the preganant’s vital signs i ( body temperature, heart rate, and blood pressure values, etc.) that obtained from the sensors that are embedded into a wearable device and notifies the doctors or caregivers in real time if there occur any contradictions in the normal threshold value using the machine learning algorithms. The notification can also be set for the patients to alert them about the periodical medications or diet to be maintained by the patients. The cloud layer stores the big data into the cloud for future references for the hospitals and the researchers. Our study findings reported that patients’ age, BMI, blood pressure, and Lymphocyte vitamin E are mainly associated with GDM diagnosing. The proposed model achieves accurate and promising results from an academic perspective. However, we still need to close from real-world scenarios. Therefore, in the future, we intend to apply our model on a large scale of pregnant patients to ensure the generalization ability of our research.

Data availability

Enquiries about data availability should be directed to the authors.

Change history

10 August 2023
This article has been retracted. Please see the Retraction Notice for more detail: https://doi.org/10.1007/s00500-023-09090-z

References

Aazam M, Hung PP, Huh EN (2014) Smart gateway based communication for cloud of things. IEEE Ninth Int Conf Intell Sensors Sensor Netw Inf Process. https://doi.org/10.1109/ISSNIP.2014.6827673
Article Google Scholar
Abdelmoneem RM, Benslimane A, Shaaban E (2020) Mobility-aware task scheduling in cloud-fog IoT-based healthcare architectures. Comput Netw 179:107348
Article Google Scholar
Adams RP et al (2015) A physiological time series dynamics-based approach to patient monitoring and outcome prediction. IEEE J Biomed Heal Inform 19(3):1068–1076. https://doi.org/10.1109/JBHI.2014.2330827.A
Article Google Scholar
Ahmad M, Bilal M, Hussain S, Ho B, Cheong T, Lee S (2016) Health fog: a novel framework for health and wellness applications. J Supercomput 72(10):3677–3695. https://doi.org/10.1007/s11227-016-1634-x
Article Google Scholar
Ahmadi M, Mirbagheri E (2019) Designing data elements and minimum data set (MDS) for creating the registry of patients with gestational diabetes mellitus. J Med Life 12(2):160–167. https://doi.org/10.25122/jml-2019-0011
Article Google Scholar
Ahmadi Z, Haghi M, Nikravan M (2021) Fog - based healthcare systems: a systematic review. Springer, US
Google Scholar
Ali SH, Saleh AI, Ali HA (2020) Effective cache replacement strategy (ECRS) for real-time fog computing environment. Cluster Comput 23(4):3309–3333. https://doi.org/10.1007/s10586-020-03089-z
Article MathSciNet Google Scholar
Alistair LS, Johnson EW, Pollard TJ (2016) Data descriptor: MIMIC-III a freely accessible critical care database. Thromb Haemost. 76(2):258–262. https://doi.org/10.1038/sdata.2016.35
Article Google Scholar
Aroda VR et al (2015) The effect of lifestyle intervention and metformin on preventing or delaying diabetes among women with and without gestational diabetes: the Diabetes Prevention Program outcomes study 10-year follow-up. J Clin Endocrinol Metab 100:1646–1653. https://doi.org/10.1210/jc.2014-3761
Article Google Scholar
Atlam HF (2018) Fog computing and the internet of things: a review. Big Data Cognit Comput 2:1–18. https://doi.org/10.3390/bdcc2020010
Article Google Scholar
Awad A, Bader-El-Den M, McNicholas J, Briggs J (2017) Early hospital mortality prediction of intensive care unit patients using an ensemble learning approach. Int J Med Inform 108:185–195. https://doi.org/10.1016/j.ijmedinf.2017.10.002
Article Google Scholar
Bonomi F, Milito R, Zhu J, Addepalli S (2012) Fog computing and its role in the internet of things. Proc First Edn MCC Workshop Mobile Cloud Comput. https://doi.org/10.1145/2342509.2342513
Article Google Scholar
Burlina S, Dalfrà MG, Chilelli NC, Lapolla A (2016) Gestational diabetes mellitus and future cardiovascular risk: an update. Int J Endocrinol 2016:1–6
Article Google Scholar
Care D (2018) Older adults: standards of medical care in diabetes-2018. Diab Care 41:S119
Article Google Scholar
Costa B, Barrio F, Cabré JJ, Piñol JL, Cos X, Solé C, Bolíbar B, Basora J, Castell C, Solà-Morales O, Salas-Salvadó J (2012) Delaying progression to type 2 diabetes among high-risk Spanish individuals is feasible in real-life primary healthcare settings using intensive lifestyle intervention. Diabetologia 55(5):1319–1328. https://doi.org/10.1007/s00125-012-2492-6
Article Google Scholar
Demšar J (2006) Statistical comparisons of classifiers over multiple data sets. The J Mach Learn Res 7:1–30
MathSciNet MATH Google Scholar
Dey N, Hassanien AE, Bhatt C, Ashour A, Satapathy SC (eds) (2018) Internet of things and big data analytics toward next-generation intelligence, vol 35. Springer, Berlin. https://doi.org/10.1007/978-3-319-60435-0
Book Google Scholar
Dilibal Ç (2020) Development of edge-IoMT computing architecture for smart healthcare monitoring platform. Int Sympos Multidiscipl Studies Innov Technol. https://doi.org/10.1109/ISMSIT50672.2020.9254501
Article Google Scholar
Egan AM, Enninga EAL, Alrahmani L, Weaver AL, Sarras MP, Ruano R (2021) Recurrent gestational diabetes mellitus: a narrative review and single-center experience. J Clin Med 10(4):569
Article Google Scholar
Ekelund M, Shaat N, Almgren P, Groop L, Berntorp K (2010) Prediction of postpartum diabetes in women with gestational diabetes mellitus. Diabetologia 53(3):452–457. https://doi.org/10.1007/s00125-009-1621-3
Article Google Scholar
El-Rashidy N, El-Sappagh S, Abuhmed T, Abdelrazek S, El-Bakry HM (2020) Intensive care unit mortality prediction: an improved patient-specific stacking ensemble model. IEEE Access 8:133541–133564. https://doi.org/10.1109/ACCESS.2020.3010556
Article Google Scholar
El-Rashidy N, El-Sappagh S, Islam SMR, El-Bakry HM, Abdelrazek S (2020) End-to-end deep learning framework for coronavirus (COVID-19) detection and monitoring. Electron 9(9):1–25. https://doi.org/10.3390/electronics9091439
Article Google Scholar
El-Rashidy N et al (2021a) Sepsis prediction in intensive care unit based on genetic feature optimization and stacked deep ensemble learning, vol 1. Springer, London
Google Scholar
El-Rashidy N, El-Sappagh S, Islam SMR, El-Bakry HM, Abdelrazek S (2021b) Mobile health in remote patient monitoring for chronic diseases: principles, trends, and challenges. Diagnostics 11(4):607. https://doi.org/10.3390/diagnostics11040607
Article Google Scholar
El-Sappagh S, Alonso JM, Islam SMR, Sultan AM, Kwak KS (2021) A multilayer multimodal detection and prediction model based on explainable artificial intelligence for Alzheimer’s disease. Sci Rep 11(1):1–27. https://doi.org/10.1038/s41598-021-82098-3
Article Google Scholar
Forkan AR, Khalil I (2016) A probabilistic model for early prediction of abnormal clinical events using vital sign correlations in home-based monitoring. In: 2016 IEEE international conference on pervasive computing and communications (PerCom) doi:https://doi.org/10.1109/PERCOM.2016.7456519.
Friedman M (1990) The use of ranks to avoid the assumption of normality implicit in the analysis of variance. J Am Stat Assoc 32(200):675–701. https://doi.org/10.1080/01621459.1937.10503522
Article MATH Google Scholar
García-Magariño I, Varela-Aldas J, Palacios-Navarro G, Lloret J (2019) Fog computing for assisting and tracking elder patients with neurodegenerative diseases. Peer-to-Peer Netw Appl 12(5):1225–1235
Article Google Scholar
Ghosh BC, Addya SK, Somy NB, Nath SB, Chakraborty S, Ghosh SK (2020) Caching techniques to improve latency in serverless architectures. Int Conf COMmun Syst NETworkS. https://doi.org/10.1109/COMSNETS48256.2020.9027427
Article Google Scholar
Gracia VD, Olmedo J (2017) Diabetes gestacional: conceptos actuales. Ginecología y obstetricia de México 85(6):380–390
Google Scholar
Guo F, Yang S, Zhang Y, Yang X, Zhang C, Fan J (2020) Nomogram for prediction of gestational diabetes mellitus in urban, Chinese, pregnant women. BMC Pregnancy Childbirth 20(1):1–8
Article Google Scholar
Gupta N, Dhurandher SK (2021) Efficient caching method in fog computing for internet of everything. Peer-to-Peer Netw Appl 40(1):439–452
Habibi P, Farhoudi M, Kazemian S, Khorsandi S, Leon-garcia A (2020) Fog computing: a comprehensive architectural survey. IEEE Access 8:69105–69133. https://doi.org/10.1109/ACCESS.2020.2983253
Article Google Scholar
Herman WH et al (2017) Impact of lifestyle and metformin interventions on the risk of progression to diabetes and regression to normal glucose regulation in overweight or obese people with impaired glucose regulation. Diabetes Care 40(12):1668–1677. https://doi.org/10.2337/dc17-1116
Article Google Scholar
Ignell C, Ekelund M, Anderberg E, Berntorp K (2016) Model for individual prediction of diabetes up to 5 years after gestational diabetes mellitus. Springerplus. https://doi.org/10.1186/s40064-016-1953-7
Article Google Scholar
Intervention L, Metformin OR (2006) NIH Public Access. 346(6): 393–403
Jäger S, Allhorn A, Bießmann F (2021) A benchmark for data imputation methods. Big Data 4:1–16. https://doi.org/10.3389/fdata.2021.693674
Article Google Scholar
Joenssen DW, Bankhofer U (2012) Hot deck methods for imputing missing data. International workshop on machine learning and data mining in pattern recognition. Springer, Berlin, Heidelberg, pp 63–75. https://doi.org/10.1007/0097836.4231.53746
Chapter Google Scholar
Karthikeyan B (2008) Partial discharge pattern classification using composite versions of probabilistic neural network inference engine. Expert Syst Appl 34:1938–1947. https://doi.org/10.1016/j.eswa.2007.02.005
Article Google Scholar
Kaur P, Gosain A (2018) Comparing the behavior of oversampling and undersampling approach of class imbalance learning by combining class imbalance problem with noise. ICT based innovations. Springer, Singapore, pp 23–30. https://doi.org/10.1007/978-981-10-6602-3
Chapter Google Scholar
Khaloufi H, Abouelmehdi K, Beni-Hssane A (2020) Fog computing for smart healthcare data analytics: an urgent necessity. Proc Int Conf Netw Inf Syst Secur. https://doi.org/10.1145/3386723.3387861
Article Google Scholar
Khan OA et al (2021) A cache-based approach toward improved scheduling in fog computing. Softw Pract Exp 51(12):2360–2372. https://doi.org/10.1002/spe.2824
Article Google Scholar
Khan IU et al (2021) Computational intelligence-based model for mortality rate prediction in COVID-19 patients. Int J Environ Res Public Health 18(12):6429. https://doi.org/10.3390/ijerph18126429
Article Google Scholar
Kraemer FA, Braten AE, Tamkittikhun N, Palma D (2020) Fog computing in healthcare—a review and discussion. IEEE Access 5:9206–9222. https://doi.org/10.1109/ACCESS.2017.2704100
Article Google Scholar
Li D-C, Liu C-W, Hu SC (2010) A learning method for the class imbalance problem with medical data sets. Comput Biol Med 40(5):509–518. https://doi.org/10.1016/j.compbiomed.2010.03.005
Article Google Scholar
Lindstrom J et al (2008) Finnish Diabetes Prevention Study (DPS) Group. Determinants for the effectiveness of lifestyle intervention in the Finnish Diabetes Prevention Study. Diabetes care 31(5):857–862. https://doi.org/10.2337/dc07-2162
Article Google Scholar
Man B, Schwartz A, Pugach O, Xia Y, Gerber B (2021) A clinical diabetes risk prediction model for prediabetic women with prior gestational diabetes. PloS ONE 16(6):e0252501. https://doi.org/10.1371/journal.pone.0252501
Article Google Scholar
Mao W, Liu Y, Ding L, Li Y (2019) Imbalanced fault diagnosis of rolling bearing based on generative adversarial network: a comparative study. IEEE Access 7:9515–9530. https://doi.org/10.1109/ACCESS.2018.2890693
Article Google Scholar
Meertens LJE et al (2020) “External validation and clinical utility of prognostic prediction models for gestational diabetes mellitus: a prospective cohort study. Acta Obstet Gynecol Scand 2019(99):891–900. https://doi.org/10.1111/aogs.13811
Article Google Scholar
Moon TK (1996) The expectation-maximization algorithm. IEEE Signal Process Mag 13(6):47–60. https://doi.org/10.1109/79.543975
Article Google Scholar
Nasralla MM (2021) Sustainable virtual reality patient rehabilitation systems with iot sensors using virtual smart cities. Sustainability 13(9):4716
Article Google Scholar
Noble D, Mathur R, Dent T, Meads C, Greenhalgh T (2011) Risk models and scores for type 2 diabetes: systematic review. Bmj. https://doi.org/10.1136/bmj.d71
Article Google Scholar
Nuzzo AM et al (2021) Placental and maternal sFlt1/PlGF expression in gestational diabetes mellitus. Sci. Rep. 11:1–10. https://doi.org/10.1038/s41598-021-81785-5
Article Google Scholar
Pop P, Zarrin B, Barzegaran M, Schulte S (2020) The FORA fog computing platform for industrial IoT. Inf Syst 98:101727
Article Google Scholar
Qiu H et al (2017) Electronic health record driven prediction for gestational diabetes mellitus in early pregnancy. Sci Rep 7(1):16417. https://doi.org/10.1038/s41598-017-16665-y
Article Google Scholar
Quy VK, Van Hau N, Van Anh D, Anh L (2021) Smart healthcare IoT applications based on fog computing: architecture, applications and challenges. Complex Intell Syst. https://doi.org/10.1007/s40747-021-00582-9
Article Google Scholar
Rachkidi E et al (2015) Towards efficient automatic scaling and adaptive cost-optimized ehealth services in cloud. 2015 IEEE Glob Commun Conf GLOBECOM. https://doi.org/10.1109/GLOCOM.2014.7417751
Article Google Scholar
Rahul S, Aron R (2021) Fog computing architecture, application and resource allocation: a review. CEUR Workshops 4638:0–2
Google Scholar
Ramos G, Borges C, Figueiroa N, Alves LV, Alves JG (2019) Physical activity pattern in early pregnancy and gestational diabetes mellitus risk among low-income women: a prospective cross-sectional study. SAGE Open Med. https://doi.org/10.1177/2050312119875922
Article Google Scholar
Ratner RE, Christophi CA, Metzger BE, Dabelea D, Bennett PH, Pi-Sunyer X, Fowler S, Kahn SE (2008) Diabetes Prevention Program Research Group. Prevention of diabetes in women with a history of gestational diabetes: Effects of metformin and lifestyle interventions. J Clin Endocrinol Metab 93(12):4774–4779. https://doi.org/10.1210/jc.2008-0772
Article Google Scholar
Risk P, Monitoring A, Prams S, Desisto CL, Kim SY, Sharma AJ (2007) Prevalence estimates of gestational diabetes mellitus in the United States, Prevalence Estimates of Gestational Diabetes Mellitus in the United States. Pregnancy Risk Assess Monitor Syst (PRAMS). https://doi.org/10.5888/pcd11.130415
Article Google Scholar
Saeed M, Lieu C, Raber G, Mark RG (2002) MIMIC II: a massive temporal ICU patient database to support research in intelligent patient monitoring. Comput Cardiol 00:641–644. https://doi.org/10.1109/CIC.2002.1166854
Article Google Scholar
Savvidou M, Nelson SM, Makgoba M, Messow C (2010) First-trimester prediction of gestational diabetes mellitus: Examining the potential of combining maternal characteristics and laboratory measures. BMC Pregnancy Childbirth 59:3017–3022. https://doi.org/10.2337/db10-0688.N.S
Article Google Scholar
Shahid MH, Hameed AR, ul Islam S, Khattak HA, Din IU, Rodrigues JJ (2020) Energy and delay efficient fog computing using caching mechanism. Comput Commun. 154:534–41. https://doi.org/10.1016/j.comcom.2020.03.001
Article Google Scholar
Shen H, Liu X, Chen Y, He B, Cheng W (2016) Associations of lipid levels during gestation with hypertensive disorders of pregnancy and gestational diabetes mellitus: a prospective longitudinal cohort study. BMJ Open 6:e013509. https://doi.org/10.1136/bmjopen-2016-013509
Article Google Scholar
Shen J, Chen J, Zheng Z, Zheng J, Liu Z, Song J, Wong SY, Wang X, Huang M, Fang PH, Jiang B (2020) An innovative artificial intelligence–based app for the diagnosis of gestational diabetes mellitus (GDM-AI): Development study. J Med Internet Res 22:1–11. https://doi.org/10.2196/21573
Article Google Scholar
Śmieja M, Struski Ł, Tabor J, Zieliński B, Spurek P (2018) Processing of missing data by neural networks. Adv Neural Inf Process Syst 31:2719–2729
Google Scholar
Sumathi A, Meganathan S (2022) Ensemble classifier technique to predict gestational diabetes mellitus (GDM). Comput Syst Sci Eng 40(1):313–325. https://doi.org/10.32604/CSSE.2022.017484
Article Google Scholar
Sussman JB, Kent DM, Nelson JP, Hayward RA (2015) Improving diabetes prevention with benefit based tailored treatment: risk based reanalysis of Diabetes Prevention Program. Bmj. https://doi.org/10.1136/bmj.h454
Article Google Scholar
Talaat FM, Ali SH, Saleh AI, Ali HA (2019) Effective load balancing strategy (ELBS) for real-time fog computing environment using Fuzzy and probabilistic neural networks. J Netw Syst Manage 27(4):883–929
Article Google Scholar
Talaat FM, Saraya MS, Saleh AI, Ali HA, Ali SH (2020) A load balancing and optimization strategy (LBOS) using reinforcement learning in fog computing environment. J Ambient Intell Humaniz Comput 11(11):4951–4966. https://doi.org/10.1007/s12652-020-01768-8
Article Google Scholar
Unger H et al (2019) The assessment of gestational age: a comparison of different methods from a malaria pregnancy cohort in sub-Saharan Africa 11 Medical and Health Sciences 1114 Paediatrics and Reproductive Medicine. BMC Pregnancy Childbirth 19(1):1–9. https://doi.org/10.1186/s12884-018-2128-z
Article MathSciNet Google Scholar
Vellido A (2019) The importance of interpretability and visualization in machine learning for applications in medicine and health care. Neural Comput Appl 32:18069–18083. https://doi.org/10.1007/s00521-019-04051-w
Article Google Scholar
Venkatesh S, Gopal S (2011) Expert systems with applications robust heteroscedastic probabilistic neural network for multiple source partial discharge pattern recognition—significance of outliers on classification capability. Expert Syst Appl 38(9):11501–11514. https://doi.org/10.1016/j.eswa.2011.03.026
Article Google Scholar
Verma P, Sood SK (2018) Fog assisted- IoT enabled patient health monitoring in smart homes. IEEE Internet Things J 5:1789–1796. https://doi.org/10.1109/JIOT.2018.2803201
Article Google Scholar
Wang J et al (2021) Machine learning approaches for early prediction of gestational diabetes mellitus based on prospective cohort study. Res Square. https://doi.org/10.21203/rs.3.rs-508626/v1
Article Google Scholar
Wu Y et al (2021) Early prediction of gestational diabetes mellitus in the Chinese population via advanced machine learning. J Clin Endocrinol Metab 106(3):1191–1205. https://doi.org/10.1210/clinem/dgaa899
Article Google Scholar
Yi S, Li C, Li Q (2015) A survey of fog computing: concepts, applications and issues. 2015 Proc Workshop Mobile Big Data. https://doi.org/10.1145/2757384.2757397
Article Google Scholar
Zhang Y et al (2020) A mid-pregnancy risk prediction model for gestational diabetes mellitus based on the maternal status in combination with ultrasound and serological findings. Exp Ther Med 20(1):293–300. https://doi.org/10.3892/etm.2020.8690
Article Google Scholar
Zheng Q, Delingette H, Ayache N (2019) Explainable cardiac pathology classification on cine MRI with motion characterization by semi-supervised learning of apparent flow. Med Image Anal 56:80–95. https://doi.org/10.1016/j.media.2019.06.001
Article Google Scholar
Zheng T et al (2019a) A simple model to predict risk of gestational diabetes mellitus from 8 to 20 weeks of gestation in Chinese women. BMC Pregnancy Childbirth 8:1–11
Google Scholar
Zheng T et al (2019b) A simple model to predict risk of gestational diabetes mellitus from 8 to 20 weeks of gestation in Chinese women. BMC Pregnancy Childbirth 8:1–10
Google Scholar
Zhu Y, Zhang C (2016) Prevalence of gestational diabetes and risk of progression to type 2 diabetes: a global perspective. Curr Diab Rep. https://doi.org/10.1007/s11892-015-0699-x
Article Google Scholar

Download references

Funding

Open access funding provided by The Science, Technology & Innovation Funding Authority (STDF) in cooperation with The Egyptian Knowledge Bank (EKB). The authors have not disclosed any funding.

Author information

Authors and Affiliations

Machine Learning and Information Retrieval department, Faculty of Artificial Intelligence, Kafrelsheiksh University, Kafrelsheiksh, Egypt
Nora El-Rashidy & Fatma M. Talaat
Delta Higher Institute for Management and Accounting Information Systems, Mansoura, Egypt
Nesma E. ElSayed
Computer Science Department, Faculty of Computers and Information, Mansoura University, Mansoura, Egypt
Amir El-Ghamry

Authors

Nora El-Rashidy
View author publications
You can also search for this author in PubMed Google Scholar
Nesma E. ElSayed
View author publications
You can also search for this author in PubMed Google Scholar
Amir El-Ghamry
View author publications
You can also search for this author in PubMed Google Scholar
Fatma M. Talaat
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Nora El-Rashidy.

Ethics declarations

Conflict of interest

The authors declared that no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article has been retracted. Please see the retraction notice for more detail:https://doi.org/10.1007/s00500-023-09090-z

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

El-Rashidy, N., ElSayed, N.E., El-Ghamry, A. et al. RETRACTED ARTICLE: Prediction of gestational diabetes based on explainable deep learning and fog computing. Soft Comput 26, 11435–11450 (2022). https://doi.org/10.1007/s00500-022-07420-1

Download citation

Accepted: 29 July 2022
Published: 27 August 2022
Issue Date: November 2022
DOI: https://doi.org/10.1007/s00500-022-07420-1

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

RETRACTED ARTICLE: Prediction of gestational diabetes based on explainable deep learning and fog computing

Abstract

Similar content being viewed by others

Utilizing fog computing and explainable deep learning techniques for gestational diabetes prediction

Remote Patient Monitoring Using IoT, Cloud Computing and AI

HeartFog: Fog Computing Enabled Ensemble Deep Learning Framework for Automatic Heart Disease Diagnosis

1 Introduction

2 Background and basic concepts

2.1 Probabilistic neural networks (PNN)

2.2 Explainability and interpretability of deep learning models

2.3 Fogs in healthcare applications

2.4 Data caching in fog

3 Related work

3.1 Utilizing fog computing in healthcare systems

3.2 Utilizing deep learning in predicting GDM

4 The proposed data replacement and prediction framework (DRPF)

4.1 IoT layer

4.2 Cloud layer

4.3 Fog layer

4.3.1 The proposed GDM module

4.3.1.1 Data finding methodology (DFM)

4.3.1.2 Explainable prediction algorithm (EPM) using DNN

5 Results

5.1 Statistical analysis

5.2 Evaluation of explainability of DL model

5.2.1 Global explainability

5.2.2 Local explainability

5.3 The performance metrics for DFM

6 Conclusion

Data availability

Change history

10 August 2023

References

Funding

Author information

Authors and Affiliations

Corresponding author

Ethics declarations

Conflict of interest

Additional information

Publisher's Note

Rights and permissions

About this article

Cite this article

Share this article

Search

Navigation