The role of machine learning algorithms in detection of gestational diabetes; a narrative review of current evidence

Kokori, Emmanuel; Olatunji, Gbolahan; Aderinto, Nicholas; Muogbo, Ifeanyichukwu; Ogieuhi, Ikponmwosa Jude; Isarinade, David; Ukoaka, Bonaventure; Akinmeji, Ayodeji; Ajayi, Irene; Chidiogo, Ezenwoba; Samuel, Owolabi; Nurudeen-Busari, Habeebat; Muili, Abdulbasit Opeyemi; Olawade, David B.

doi:10.1186/s40842-024-00176-7

The role of machine learning algorithms in detection of gestational diabetes; a narrative review of current evidence

Review article
Open access
Published: 25 June 2024

Volume 10, article number 18, (2024)
Cite this article

Download PDF

You have full access to this open access article

Clinical Diabetes and Endocrinology Aims and scope Submit manuscript

The role of machine learning algorithms in detection of gestational diabetes; a narrative review of current evidence

Download PDF

Emmanuel Kokori¹,
Gbolahan Olatunji¹,
Nicholas Aderinto ORCID: orcid.org/0000-0003-0004-7389²,
Ifeanyichukwu Muogbo²,
Ikponmwosa Jude Ogieuhi³,
David Isarinade¹,
Bonaventure Ukoaka⁴,
Ayodeji Akinmeji⁵,
Irene Ajayi¹,
Ezenwoba Chidiogo⁶,
Owolabi Samuel⁷,
Habeebat Nurudeen-Busari¹,
Abdulbasit Opeyemi Muili² &
…
David B. Olawade⁸

400 Accesses
Explore all metrics

Abstract

Gestational Diabetes Mellitus (GDM) poses significant health risks to mothers and infants. Early prediction and effective management are crucial to improving outcomes. Machine learning techniques have emerged as powerful tools for GDM prediction. This review compiles and analyses the available studies to highlight key findings and trends in the application of machine learning for GDM prediction. A comprehensive search of relevant studies published between 2000 and September 2023 was conducted. Fourteen studies were selected based on their focus on machine learning for GDM prediction. These studies were subjected to rigorous analysis to identify common themes and trends. The review revealed several key themes. Models capable of predicting GDM risk during the early stages of pregnancy were identified from the studies reviewed. Several studies underscored the necessity of tailoring predictive models to specific populations and demographic groups. These findings highlighted the limitations of uniform guidelines for diverse populations. Moreover, studies emphasised the value of integrating clinical data into GDM prediction models. This integration improved the treatment and care delivery for individuals diagnosed with GDM. While different machine learning models showed promise, selecting and weighing variables remains complex. The reviewed studies offer valuable insights into the complexities and potential solutions in GDM prediction using machine learning. The pursuit of accurate, early prediction models, the consideration of diverse populations, clinical data, and emerging data sources underscore the commitment of researchers to improve healthcare outcomes for pregnant individuals at risk of GDM.

Prediction models for the risk of gestational diabetes: a systematic review

Article Open access 08 February 2017

A Comprehensive Review on Machine Learning Approaches for Prediction of Gestational Diabetes Mellitus

Prediction of gestational diabetes based on nationwide electronic health records

Article 13 January 2020

Introduction

Gestational diabetes mellitus (GDM) is characterised by any degree of glucose intolerance that either develops or is first identified during pregnancy [1]. It encompasses cases of previously undiagnosed glucose intolerance that may have existed before or emerged during pregnancy, regardless of subsequent management approaches, such as dietary modification or insulin therapy, and whether the condition persists post-pregnancy [2]. Regional disparities in GDM prevalence are evident, with the highest rates found in the Middle East and North Africa (12.9%), followed by Southeast Asia (11.7%), the Western Pacific (11.7%), South and Central America (11.2%), and the lowest rates in Europe (5.8%), North America, and the Caribbean (7.0%) [3]. GDM is a widespread pregnancy complication, affecting 1–14% of pregnancies worldwide, with variations influenced by patient ethnicity and diagnostic criteria [4, 5]. The impact of GDM on maternal and fetal health is significant, often leading to preterm delivery, cesarean section, excessive fetal growth, hyperinsulinemia, hypoglycemia, and hyperbilirubinemia in newborns [6,7,8]. Additionally, GDM can progress to Type 2 Diabetes Mellitus (T2DM), resulting in birth-related complications, visceromegaly, fetal macrosomia, and an increased risk of metabolic disorders for both mother and child, including hypertension, obesity, and metabolic syndrome [9, 10].

The precise pathophysiological mechanisms of GDM remain incompletely understood, but hormonal imbalances, impaired insulin sensitivity, and pancreatic β-cell malfunction are suggested contributors [11]. About 16% of pregnancies globally are linked to hyperglycemia, with 84% classified as GDM [12]. GDM significantly contributes to the onset of T2DM in both mothers and offspring, emphasising the importance of effectively managing blood glucose levels during pregnancy to prevent and reduce the prevalence of T2D in future generations [13]. Historically, screening for GDM relied on medical history, previous obstetric outcomes, and family history of T2D. However, this approach exhibited an approximate 50% failure rate in detecting GDM among pregnant women. In 1973, a pivotal study recommended adopting the 50 g 1-h oral glucose tolerance test as a screening tool, which is now widely used by approximately 95% of obstetricians in the United States for GDM screening. In 2014, the U.S. Preventive Services Task Force (USPSTF) recommended GDM screening for all pregnant women at 24 weeks [12, 14, 15].

Early screening and diagnosis of GDM are crucial for reducing the risks of pregnancy-related complications, such as macrosomia, preterm birth, pre-eclampsia, and neonatal intensive care admissions [14, 16]. Existing diagnostic tools have limitations in this regard. To enhance the prediction of GDM, clinical, sociodemographic, and anthropometric data have been employed in traditional regression analysis-based clinical risk prediction models. Recent advancements in machine learning promise to increase the accuracy of disease perception, diagnosis, and management. For instance, Belsti et al. [17] used a predictive analysis on antenatal care records. Their model achieved 85% accuracy, 90% precision, 78% recall, 84% F1-score, 81% sensitivity, 90% specificity, 92% positive predictive value, 78% negative predictive value, and a Brier Score of 0.39, surpassing the performance of traditional statistical methods. Most outcome prediction models enable early intervention in high-risk women and cost-effective screening by identifying low-risk individuals, potentially eliminating the need for glucose tolerance tests [18]. This review explores the effectiveness of machine learning algorithms in detecting GDM, incorporating relevant studies and data on their application for GDM detection.

Methodology

Literature search strategy

A literature search was carried out to review the role of machine learning algorithms in the early detection of GDM and their impact on fetomaternal outcomes. The following databases were searched: PubMed, Scopus, Web of Science, and Google Scholar. The search was conducted for studies published between 2000 and September 2023. The following keywords were used (“machine learning”[MeSH Terms] OR (“machine”[All Fields] AND “learning”[All Fields]) OR “machine learning”[All Fields]) AND (“algorithms”[MeSH Terms] OR “algorithms”[All Fields]) AND (“diabetes, gestational”[MeSH Terms] OR (“diabetes”[All Fields] AND “gestational”[All Fields]) OR “gestational diabetes”[All Fields] OR (“gestational”[All Fields] AND “diabetes”[All Fields] AND “mellitus”[All Fields]) OR “gestational diabetes mellitus”[All Fields]).

Inclusion and exclusion criteria

Articles were included if they met the following criteria:

Published in English.
Peer-reviewed original studies.
Focused on applying machine learning algorithms in the context of GDM.
Included information on using machine learning in detecting or predicting GDM.

The exclusion criteria were:

Systematic analyses, meta-analyses, reviews, conference abstracts, case reports, editorials, and letters.
Studies that did not provide relevant information or data on the topic.

Study selection

Two independent reviewers (NA & EK) initially screened titles and abstracts to identify potentially relevant articles. Full-text articles were then retrieved for further evaluation. Discrepancies were resolved through discussion, and a third reviewer (GO) was consulted when necessary.

Data extraction

Data were extracted from the selected articles, including study design, sample size, characteristics of the study population, machine learning algorithms employed, predictive variables used, outcomes measured, and reported results.

Data synthesis

The findings from the selected studies were synthesised to provide an overview of the current evidence regarding the role of machine learning algorithms in the early detection of GDM and their impact on fetomaternal outcomes. Common themes, trends, and methodological differences were identified. Results were analysed and presented in a clear and organised manner.

Results

The studies in this review focused on predicting and detecting GDM through machine learning algorithms (See Table 1). Most were retrospective studies; others were cohort studies, and two were randomised clinical trials. The populations studied vary in size, from smaller cohorts of just a few thousand individuals to larger populations exceeding 30,000. The studies reviewed utilised diverse machine learning algorithms, including Naïve Bayes, Decision Trees, Support Vector Machines, Neural Networks, Logistic Regression, Lasso-Logistics, Gradient Boosting Decision Tree (GBDT), Deep Neural Network (DNN), Gaussian Naïve Bayes (GNB), Bernoulli Naïve Bayes (BNB), and various ensemble methods such as Light Gradient Boosting Machine (LGBM) and Extreme Gradient Boosting (XGBoost). Data sources include pregnancy registries, perinatal databases, clinical records, and data from health institutions or hospitals.

Table 1 Characteristics of reviewed studies

Full size table

Model performance and comparison

The studies conducted by Kang et al. (2023) and Yunzhen et al. (2020) demonstrated notable outcomes in terms of model performance and comparison [21, 32]. Kang et al. [21] conducted an analysis comparing the effectiveness of two machine learning algorithms, namely Light Gradient Boosting Machine (LGBM) and XGBoost, in predicting GDM. This study revealed that XGBoost consistently outperformed LGBM when evaluated across diverse cohorts and time points, positioning it as a promising choice for the prediction of GDM. In contrast, Yunzhen et al. [32] explored the potential of machine learning methods to surpass traditional logistic regression in GDM prediction. Their results, however, indicated that several machine-learning methods fell short of outperforming logistic regression.

Jie et al. [24] implemented diverse machine learning algorithms, including Logistic Regression, Gradient Boosting Decision Tree, XGBoost, and Lightgbm. The outcome was a model with high accuracy, precision, and recall, demonstrating the potential of these algorithms to enhance GDM prediction and risk assessment. Complementing this, Yang-Ting et al. [30] introduced a clinically cost-effective 7-variable Logistic Regression model. This simplified approach offers a promising avenue for GDM prediction, making it accessible and practical for clinical applications.

Early prediction

Gabriel et al. [19] develop employed Gaussian Naïve Bayes (GNB), Bernoulli Naïve Bayes (BNB), Decision Trees (DT), Support Vector Machines (SVMs), Multi-Layer Perceptron (MLP) to predict early prediction of GDM within the early stages of pregnancy through regular examinations. The results showed that the developed ML models and the proposed data augmentation method achieved excellent predictive performance for GDM. Similarly, Jenny et al. [23] introduced a novel machine learning-based stratification system. The study utilised linear and non-linear tree-based regression models, including XGBoost. The study demonstrated a straightforward method for implementing proportionate care delivery based on existing features in GDM clinics. The machine learning-based stratification system identified patients at risk of high blood glucose levels, enhancing the ability to tailor care interventions. Furthermore, Yi-xin et al. [22] utilised machine learning to forecast GDM risk with a moderate performance at pregnancy initiation, ultimately achieving good-to-excellent predictive capabilities by the end of the first trimester. The ML algorithm utilised in the study was XGBoost. The machine learning model demonstrated moderate performance in predicting GDM at pregnancy initiation and good-to-excellent performance at the first cohort’s end of the first trimester. However, in the second cohort, the trained XGBoost exhibited moderate performance. The primary objective of the prospective cohort study conducted by Jingyuan Wang et al. [33] was to develop and verify an early prediction model for GDM using machine learning algorithms. Various machine learning algorithms, including LR, Random Forest (RT), ANN, and SVM, were employed in the study. The study findings indicate that the constructed New-Stacking model theoretically aimed for optimal specificity, accuracy, and AUC. Nonetheless, the SVM model demonstrated superior performance, specifically in sensitivity.

Jesús et al. [20] conducted a study to address the barriers to early detection of GDM in pregnant Mexican women. The study employed a machine-learning-driven method to select the best predictive variables for GDM risk. The identified variables included age, family history of type 2 diabetes, previous diagnosis of hypertension, pregestational body mass index, gestational week, parity, birth weight of the last child, and random capillary glucose.

Subsequently, an artificial neural network approach was used to build the AI-based prediction model. The developed model demonstrated a high level of accuracy, reaching 70.3%, and sensitivity, achieving 83.3%. These results indicate the model’s effectiveness in identifying pregnant women at high risk of developing GDM. Moreover, de Freitas et al. [31] conducted a study aiming to characterise GDM in pregnant women better using Attenuated Total Reflection Fourier-transform infrared (ATR-FTIR) spectroscopy. The study employed chemometric approaches, integrating feature selection algorithms along with discriminant analysis methods such as Linear Discriminant Analysis (LDA), Quadratic Discriminant Analysis (QDA), and Support Vector Machines (SVM). The results obtained by Genetic Algorithm Linear Discriminant Analysis (GA-LDA) were reported as the most satisfactory, achieving % accuracy, sensitivity, and specificity of 100%.

Results in diverse populations

Mukkesh Kumar et al. [26] conducted a cohort study to evaluate the predictive ability of the existing UK National Institute for Health and Care Excellence (NICE) guidelines for assessing GDM using machine learning. This study employed the CatBoost gradient boosting algorithm and the Shapley feature attribution framework for predictive modelling. The findings of the study revealed that the existing UK NICE guidelines were insufficient to assess GDM risk in Asian women. Furthermore, the non-invasive predictive model developed in this study demonstrated superior performance to the current state-of-the-art machine learning models in predicting GDM. Similarly, Mukkesh Kumar et al. [27] built a preconception-based GDM predictor to enable early intervention. Additionally, the study aimed to assess the associations of top predictors with GDM and adverse birth outcomes. Participants were recruited from multi-ethnic groups (Chinese, Malay, Indian, or any combination of these three ethnicities). The study employed an evolutionary algorithm-based automated machine learning (AutoML) approach, incorporating the SHAP (SHapley Additive exPlanations) framework and TPOT (Tree-based Pipeline Optimization Tool). The study successfully devised a population-based predictive care solution, utilising an AutoML approach, to assess the risk of developing GDM among Asian women in the preconception period. While effective in some contexts, their findings revealed that these algorithms proved insufficient for accurately assessing GDM risk in some ethnic groups of women. This study highlights the need for population-specific considerations when addressing GDM.

Predictive models for specific cohorts

Yuhan et al. [28] conducted a Randomized Clinical Trial to apply machine learning techniques to develop a Clinical Decision Support System (CDSS). The objective was to predict the risk of Gestational Diabetes Mellitus (GDM), specifically in a high-risk group of women with overweight and obesity.. The study employed both Random Forest and Logistic Regression models for prediction. The study successfully developed a simple yet effective model utilising machine learning algorithms to predict the risk of GDM in the first trimester. Notably, the model achieved this without relying on blood examination indexes. Li-Li et al. [29] conducted a retrospective study to investigate the application of a machine learning algorithm for predicting GDM in early pregnancy. The machine learning algorithm employed in the study was the Random Forest regression algorithm. Notably, the model identified body weight at birth and the mother’s weight as strongly predictive variables for GDM. Additionally, other variables such as colpomycosis, kidney disease, the number of births by the mother, regular menstruation, blood type, and hepatitis consistently ranked among the top 20 most influential factors. They were found to be linked to GDM in the study.

Clinical data and treatment modality

Lauren et al. [25] conducted a population-based cohort study to investigate whether clinical data at different stages of pregnancy could predict the treatment modality for GDM. The focus of the study was on predicting the risks for pharmacologic treatment beyond medical nutrition therapy (MNT) for pregnant women diagnosed with GDM. The study employed transparent and ensemble machine learning methods for predictive modelling, incorporating LASSO regression and a super learner. The super learner included classification, regression tree, LASSO regression, random forest, and extreme gradient boosting algorithms. The study’s findings demonstrated reasonably high predictability for GDM treatment modality at GDM diagnosis and maintained high predictability at 1-week post-GDM diagnosis. In parallel, Jenny et al. [23] demonstrated the development of an innovative method for implementing proportionate care delivery based on existing features within GDM clinics. For predictive modelling, the study employed linear and non-linear tree-based regression models, including metrics such as XGBoost MSE (Mean Squared Error), R2 (R-squared), and MAE (Mean Absolute Error). The findings suggest that such a machine learning-based stratification system could provide an effective and practical approach for tailoring care interventions based on existing features within GDM clinics, potentially improving patient outcomes and resource allocation.

Discussion

The studies reviewed here encompass various methodologies, underlining the multifaceted nature of GDM prediction. One striking trend within this collection of studies is the detailed comparison of machine learning algorithms. Algorithms like XGBoost and Logistic Regression have demonstrated their effectiveness in GDM prediction [29]. However, it is essential to recognise that there is no one-size-fits-all solution. While XGBoost displayed superiority in several studies, comprehending the strengths and weaknesses of different algorithms becomes crucial for optimising predictive models within various contexts.

The importance of early prediction for effective GDM management cannot be overstated, and it is evident in the significant emphasis placed on this aspect in the reviewed studies [25, 34] (Fig. 1). The rationale behind early prediction lies in the potential to initiate timely interventions and provide personalised care to pregnant women at risk of developing GDM. The complications associated with GDM can have profound and long-lasting effects on both the mother and child, making early detection a critical component of effective healthcare [35]. This emphasis on early prediction is reflected in the proliferation of diverse models designed to forecast GDM risk during the early stages of pregnancy. The variety of models exemplified by the comprehensive work of Gabriel Cubillos et al. [19] underscores the collective ambition within the scientific community to enhance the accuracy and reliability of GDM predictions. The study by Gabriel Cubillos and their team is particularly noteworthy as it prioritised early prediction and explored the potential of different machine-learning models [19]. They expanded the toolkit for healthcare providers and researchers by developing and optimising twelve distinct models. These models are fine-tuned to deliver high prediction performance during the early stages of pregnancy. This multi-pronged approach allows for more comprehensive risk assessment, increasing the chances of timely interventions. The focus on early prediction is not only about identifying cases but also about developing a deeper understanding of the factors and variables that contribute to the development of GDM [36]. By emphasising the importance of early detection, these studies pave the way for tailoring interventions that can prevent or mitigate the impact of GDM. The ultimate goal is to improve maternal and fetal health outcomes by making proactive, personalised care a standard practice in obstetrics.

Studies within this review underscore the importance of tailoring predictive models to specific populations and demographic groups when addressing the prediction and early detection of GDM [19, 23, 30]. These studies highlight that a one-size-fits-all approach is insufficient, and demographic-specific considerations are essential for constructing accurate predictive models. Mukkesh Kumar et al. [26] have made a particularly striking contribution by shedding light on the limitations of employing uniform guidelines for diverse populations, specifically emphasising the challenges faced by Asian women. Their findings reveal that traditional, broadly applicable guidelines may not adequately capture the unique risk factors and nuances associated with GDM in Asian populations. This study emphasises the necessity of considering ethnicity, genetics, and other demographic-specific factors when constructing predictive models for GDM. By doing so, healthcare providers can better identify at-risk individuals within these populations and tailor interventions and care strategies to their specific needs. Similarly, the research conducted by Yuhan Du et al. (2022) provides a compelling illustration of the potential for augmenting prediction accuracy by focusing on high-risk groups [23]. In this case, the study zeroes in on women who are overweight or obese, a demographic with a higher susceptibility to GDM. By developing a specialised clinical decision support system for this specific cohort, the study recognises the unique risk profile of these individuals. This targeted approach can enhance prediction accuracy, ensuring women at the highest risk receive the necessary attention, interventions, and care. These findings indicate the importance of healthcare equity, emphasising that predictive models must be sensitive to the diversity of the populations they serve. The one-size-fits-all approach is no longer adequate, as demographic factors significantly determine GDM risk. Future research and healthcare initiatives should consider these demographic-specific considerations when designing predictive models, ultimately leading to more accurate risk assessment and better-tailored interventions.

Lauren et al. (2022) and Jenny et al. (2022) made substantial contributions to the field by emphasising the importance of integrating clinical data into the predictive models for GDM [23, 25]. These studies provide valuable insights into how leveraging clinical data can enhance the treatment and care delivery for individuals diagnosed with GDM, ultimately improving patient outcomes. The integration of clinical data into predictive models offers several crucial advantages. First and foremost, it enables healthcare providers to personalise and optimise the treatment and care for pregnant individuals diagnosed with GDM. By considering clinical data such as responsiveness to medical nutrition therapy, they can tailor interventions to each patient’s specific needs. This individualised approach is essential, as GDM management can vary significantly from one person to another [37]. Furthermore, incorporating clinical data fosters a more patient-centred approach to care. It ensures that the treatment plan aligns with the patient’s specific health profile, preferences, and response to interventions. This patient-centred approach can improve patient satisfaction, compliance, and overall well-being. Jenny et al. [23] introduced the concept of proportionate care delivery based on available clinical data. This innovative approach streamlines care and ensures that resources are allocated efficiently, addressing patients’ needs more effectively [30]. By leveraging existing clinical data, healthcare providers can identify individuals at risk of high blood glucose levels, enabling proactive intervention and reducing the likelihood of complications associated with uncontrolled GDM.

Nonetheless, it is essential to acknowledge that challenges persist within GDM prediction. A common challenge encountered is the extensive array of variables associated with GDM [38]. The condition’s multifaceted nature means numerous factors must be considered, making selecting and weighing these variables complex [39]. While studies like that of Jie et al. (2022) have demonstrated the potential of different machine-learning models, addressing this variable complexity remains a significant challenge [26]. Researchers must continue refining their models and methodologies to accurately incorporate the full spectrum of relevant variables. Moreover, applying ensemble methods, such as stacking, underscores the aspiration to enhance predictive performance. While these methods promise to improve accuracy, they also introduce additional layers of complexity. Studies must balance model sophistication and practicality, ensuring that predictive models can be effectively implemented in real-world clinical settings.

As technology and healthcare data evolve, future research can leverage emerging opportunities. Integrating real-time data from wearable devices, exploring genetic data, and incorporating a more comprehensive range of health-related information are all promising avenues for improving predictive models. These advanced data sources have the potential to provide a more holistic understanding of GDM risk, leading to more accurate and timely predictions [40]. Furthermore, future research should consider the holistic context in which GDM occurs. Focusing on patient-centred outcomes and the social determinants of GDM can deepen our understanding of this condition. Factors such as access to healthcare, socioeconomic status, and lifestyle can significantly impact an individual’s risk of developing GDM. By considering these broader determinants, researchers and healthcare providers can develop more comprehensive and effective management strategies that address the medical aspects and the social and environmental factors influencing GDM.

Limitations and strengths of review

This review explores various studies on predicting and detecting GDM through machine learning methods. It encompasses a wide range of study designs, population groups, and machine learning algorithms, providing an inclusive overview of this field’s current state of research. However, the studies included in this review span across different geographical regions and demographic profiles. While this diversity enriches the scope of the review, it can simultaneously limit the generalizability of findings. GDM risk factors and predictive models may exhibit variations among populations, and the review would benefit from a more thorough discussion of the implications arising from this variability. Additionally, this review primarily relies on studies published in English, which might introduce publication bias, potentially overlooking negative or inconclusive results less readily available in English literature.

Conclusion

Predicting and early detecting GDM through machine learning techniques is a dynamic and evolving field. This review shows significant findings and trends across diverse studies, shedding light on the potential and challenges within this domain. The significance of early prediction in facilitating effective GDM management is striking, with numerous studies committed to crafting models capable of identifying GDM risk in the early stages of pregnancy. XGBoost emerged prominently as a consistent performer, showcasing superior predictive capabilities across various cohorts and time points. These models create opportunities for timely interventions and personalised care, ultimately improving outcomes for both mothers and infants. Nevertheless, the challenges at hand are notable. The vast array of variables associated with GDM poses a substantial hurdle in the quest for accurate prediction models. The selection and weighting of these variables remain intricate tasks, necessitating ongoing research and innovation in feature engineering. Furthermore, the emphasis on tailoring predictive models to specific populations, evident in studies focusing on Asian women or high-risk groups, underscores the importance of demographic-specific considerations. Predictive models must adapt to these groups’ unique characteristics and risk factors. The practicality of implementing proportionate care delivery based on readily available clinical data underscores the value of leveraging existing resources effectively. As technology and healthcare data continue to advance, there is an opportunity for future research to harness real-time data from wearable devices and genetic information to enhance predictive models further. These emerging data sources could revolutionise GDM prediction and early intervention. Focusing on patient-centred outcomes and exploring the role of social determinants in GDM prediction can deepen our understanding of this condition. It can pave the way for more comprehensive and effective management strategies considering medical variables and broader contexts in which GDM occurs. This review offers valuable insights and directions for future studies in GDM prediction through machine learning techniques.

Availability of data and materials

No new datasets were generated for this study. All data used are within this manuscript.

Abbreviations

GDM:: Gestational Diabetes Mellitus
T2DM:: Type 2 Diabetes Mellitus
USPSTF:: U.S. Preventive Services Task Force
SHAP:: SHapley Additive exPlanations
TPOT:: Tree-based Pipeline Optimization Tool
MeSH:: Medical Subject Headings
MSE:: Mean Squared Error
R2:: R-squared (coefficient of determination)
MAE:: Mean Absolute Error
BMI:: Body Mass Index
OGTT:: Oral Glucose Tolerance Test
LGBM:: Light Gradient Boosting Machine
XGBoost:: Extreme Gradient Boosting
SVM:: Support Vector Machines
MLP:: Multi-Layer Perceptron
KNN:: K-Nearest Neighbors
LR:: Logistic Regression
RF:: Random Forest
ET:: Extra Trees
BRF:: Balanced Random Forest
GB:: Gradient Boosting
ANN:: Artificial Neural Network
HbA1c:: Hemoglobin A1c
FPG:: Fasting Plasma Glucose
TG:: Triglyceride
TC:: Total Cholesterol
HDL:: High-Density Lipoprotein
SHAP:: SHapley Additive exPlanations
TPOT:: Tree-based Pipeline Optimization Tool

References

Metzger BE, Coustan DR. Summary and recommendations of the Fourth International Workshop-Conference on Gestational Diabetes Mellitus. The Organizing Committee. Diabetes Care. 1998;21 Suppl 2:B161–7.
CAS PubMed Google Scholar
American Diabetes Association. Gestational diabetes mellitus. Diabetes Care. 2004;7 Suppl 1:S88-90. https://doi.org/10.2337/diacare.27.2007.s88. PMID: 14693936.
Article Google Scholar
Zhu Y, Zhang C. Prevalence of gestational diabetes and risk of progression to type 2 diabetes: a global perspective. Curr Diab Rep. 2016;16:7. https://doi.org/10.1007/s11892-015-0699-x.
Article CAS PubMed PubMed Central Google Scholar
Siddiqui K, George TP. Resistin role in development of gestational diabetes mellitus. Biomark Med. 2017;11(7):579–86. https://doi.org/10.2217/bmm-2017-0013.
Article CAS PubMed Google Scholar
Katra P, Dereke J, Nilsson C, Hillman M. Plasma levels of the interleukin-1-receptor antagonist are lower in women with gestational diabetes mellitus and are particularly associated with postpartum development of type 2 diabetes. PLoS One. 2016;11(5):e0155701. https://doi.org/10.1371/journal.pone.0155701.
Article CAS PubMed PubMed Central Google Scholar
Ferrocino I, Ponzo V, Gambino R, Zarovska A, Leone F, Monzeglio C, et al. Changes in the gut microbiota composition during pregnancy in patients with gestational diabetes mellitus (GDM). Sci Rep. 2018;8(1):12216. https://doi.org/10.1038/s41598-018-30735-9.
Article ADS CAS PubMed PubMed Central Google Scholar
Chanda S, Dogra V, Hazarika N, Bambrah H, Sudke AK, Vig A, et al. Prevalence and predictors of gestational diabetes mellitus in rural Assam: a cross-sectional study using mobile medical units. BMJ Open. 2020;10(11):e037836. https://doi.org/10.1136/bmjopen-2020-037836.
Article PubMed PubMed Central Google Scholar
Wani K, Sabico S, Alnaami AM, Al-Musharaf S, Fouda MA, Turkestani IZ, et al. Early-pregnancy metabolic syndrome and subsequent incidence in gestational diabetes mellitus in Arab women. Front Endocrinol. 2020;11:98. https://doi.org/10.3389/fendo.2020.00098.
Article Google Scholar
Zhu H, Chen B, Cheng Y, Zhou Y, Yan Y-S, Luo Q, et al. Insulin therapy for gestational diabetes mellitus does not fully protect offspring from diet-induced metabolic disorders. Diabetes. 2019;68(4):696–708. https://doi.org/10.2337/db18-1151.
Article CAS PubMed Google Scholar
Choudhury AA, Devi RV. Gestational diabetes mellitus - a metabolic and reproductive disorder. Biomed Pharmacother. 2021;143:112183. https://doi.org/10.1016/j.biopha.2021.112183.
Article CAS PubMed Google Scholar
Kim HY, Kim J, Noh E, Ahn KH, Cho GJ, Hong S, Oh M, Kim H. Prepregnancy hemoglobin levels and gestational diabetes mellitus in pregnancy. Diabetes Res Clin Pract. 2021;171:1–7. https://doi.org/10.1016/j.diabres.2020.108608.
Article CAS Google Scholar
Jawad F, Ejaz K. Gestational diabetes mellitus in South Asia: epidemiology. J Pak Med Assoc. 2016;66(9 Suppl 1):S5-7.
PubMed Google Scholar
Poon LC, Simmons D, Hyett JA, Da Fonseca EB, Hod M. The first-trimester of pregnancy—a window of opportunity for prediction and prevention of pregnancy complications and future life. Diabetes Res Clin Pract. 2018;145:20–30. https://doi.org/10.1016/j.diabres.2018.05.002.
Article PubMed Google Scholar
Spaight C, Gross J, Horsch A, Puder JJ. Gestational diabetes mellitus. Endocr Dev. 2016;31:163–78.
Article CAS PubMed Google Scholar
ACOG practice bulletin no. 190: gestational diabetes mellitus. Obstet Gynecol. 2018;131(2):e49–64. https://doi.org/10.1097/AOG.0000000000002501.
Periyathambi N, Parkhi D, Ghebremichael-Weldeselassie Y, Patel V, Sukumar N, Siddharthan R, Narlikar L, Saravanan P. Machine learning prediction of non-attendance to postpartum glucose screening and subsequent risk of type 2 diabetes following gestational diabetes. PLoS One. 2022;17(3):e0264648. https://doi.org/10.1371/journal.pone.0264648.
Article CAS PubMed PubMed Central Google Scholar
Belsti Y, Moran L, Du L, Mousa A, De Silva K, Enticott J, et al. Comparison of machine learning and conventional logistic regression-based prediction models for gestational diabetes in an ethnically diverse population; the Monash GDM Machine learning model. Int J Med Inf. 2023;179:105228.
Article Google Scholar
Gupta V, Gill S, Sandhu JK, Sahu R. Comparative study of machine learning models for early gestational diabetes mellitus. In: 2023 International Conference on Circuit Power and Computing Technologies (ICCPCT). 2023. p. 1761–6. Available from: https://ieeexplore.ieee.org/abstract/document/10244924. Cited 2023 Oct 11.
Cubillos G, Monckeberg M, Plaza A, et al. Development of machine learning models to predict gestational diabetes risk in the first half of pregnancy. BMC Pregnancy Childbirth. 2023;23(1):469. https://doi.org/10.1186/s12884-023-05766-4. Published 2023 Jun 23.
Article PubMed PubMed Central Google Scholar
Jesús M, Montoya A, Alberto L, Marcela L, Lorena I, Alberto D, Javier F, Ávila EO, Efraín A, Concepción M, Muñoz ER. MIDO GDM: an innovative artificial intelligence-based prediction model for the development of gestational diabetes in Mexican women. Sci Rep. 2023;13(1):1–11. https://doi.org/10.1038/s41598-023-34126-7.
Article CAS Google Scholar
Kang BS, Lee SU, Hong S, et al. Prediction of gestational diabetes mellitus in Asian women using machine learning algorithms. Sci Rep. 2023;13(1):13356. https://doi.org/10.1038/s41598-023-39680-8. Published 2023 Aug 16.
Article ADS CAS PubMed PubMed Central Google Scholar
Li YX, Liu YC, Wang M, Huang YL. Prediction of gestational diabetes mellitus at the first trimester: machine-learning algorithms [published online ahead of print, 2023 Jul 21]. Arch Gynecol Obstet. 2023: https://doi.org/10.1007/s00404-023-07131-4. https://doi.org/10.1007/s00404-023-07131-4.
Yang J, Clifton D, Hirst JE, et al. Machine learning-based risk stratification for gestational diabetes management. Sensors (Basel). 2022;22(13):4805. https://doi.org/10.3390/s22134805. Published 2022 Jun 25.
Article ADS PubMed Google Scholar
Zhang J, Wang F. Prediction of gestational diabetes mellitus under cascade and ensemble learning algorithm. Comput Intell Neurosci. 2022;2022:3212738. https://doi.org/10.1155/2022/3212738. Published 2022 Jul 14.
Article PubMed PubMed Central Google Scholar
Liao LD, Ferrara A, Greenberg MB, et al. Development and validation of prediction models for gestational diabetes treatment modality using supervised machine learning: a population-based cohort study. BMC Med. 2022;20(1):307. https://doi.org/10.1186/s12916-022-02499-7. Published 2022 Sep 15.
Article CAS PubMed PubMed Central Google Scholar
Kumar M, Chen L, Tan K, et al. Population-centric risk prediction modeling for gestational diabetes mellitus: a machine learning approach. Diabetes Res Clin Pract. 2022;185:109237. https://doi.org/10.1016/j.diabres.2022.109237.
Article PubMed PubMed Central Google Scholar
Kumar M, Ang LT, Png H, et al. Automated machine learning (AutoML)-derived preconception predictive risk model to guide early intervention for gestational diabetes mellitus. Int J Environ Res Public Health. 2022;19(11):6792. https://doi.org/10.3390/ijerph19116792. Published 2022 Jun 1.
Article CAS PubMed PubMed Central Google Scholar
Du Y, Rafferty AR, McAuliffe FM, Wei L, Mooney C. An explainable machine learning-based clinical decision support system for prediction of gestational diabetes mellitus. Sci Rep. 2022;12(1):1170. https://doi.org/10.1038/s41598-022-05112-2. Published 2022 Jan 21.
Article ADS CAS PubMed PubMed Central Google Scholar
Wei L, Pan Y, Zhang Y, Chen K, Wang H, Wang J. Application of machine learning algorithm for predicting gestational diabetes mellitus in early pregnancy†. Front Nurs. 2021;8(3):209–21. https://doi.org/10.2478/fon-2021-0022.
Article ADS Google Scholar
Wu YT, Zhang CJ, Mol BW, et al. Early prediction of gestational diabetes mellitus in the Chinese population via advanced machine learning. J Clin Endocrinol Metab. 2021;106(3):e1191–205. https://doi.org/10.1210/clinem/dgaa899.
Article PubMed Google Scholar
De Freitas DL, De Morais CD, Cornetta MD, Camargo JD, De Lima KM, Crispim JC. Spectrochemical differentiation in gestational diabetes mellitus based on attenuated total reflection Fourier-transform infrared (ATR-FTIR) spectroscopy and multivariate analysis. Sci Rep. 2020;10(1):1–10. https://doi.org/10.1038/s41598-020-75539-y.
Article CAS Google Scholar
Ye Y, Xiong Y, Zhou Q, Wu J, Li X, Xiao X. Comparison of machine learning methods and conventional logistic regressions for predicting gestational diabetes using routine clinical data: a retrospective cohort study. J Diabetes Res. 2020;2020:4168340. https://doi.org/10.1155/2020/4168340. Published 2020 Jun 12.
Article CAS PubMed PubMed Central Google Scholar
Wang J, Chen X, Pan Y, Chen K, Zhang Y, Li Q, et al. Machine learning approaches for early prediction of gestational diabetes mellitus based on prospective cohort study. 2021;1(14). https://doi.org/10.21203/rs.3.rs-508626/v1.
Gündoğdu S. Efficient prediction of early-stage diabetes using XGBoost classifier with random forest feature selection technique. Multimed Tools Appl. 2023;82:34163–81. https://doi.org/10.1007/s11042-023-15165-8.
Article Google Scholar
Wu S, Li L, Hu KL, Wang S, Zhang R, Chen R, Liu L, Wang D, Pan M, Zhu B, Wang Y, Yuan C, Zhang D. A prediction model of gestational diabetes mellitus based on OGTT in early pregnancy: a prospective cohort study. J Clin Endocrinol Metab. 2023;108(8):1998–2006. https://doi.org/10.1210/clinem/dgad052.
Article PubMed Google Scholar
Correa PJ, Venegas P, Palmeiro Y, Albers D, Rice G, Roa J, Cortez J, Monckeberg M, Schepeler M, Osorio E, Illanes SE. First trimester prediction of gestational diabetes mellitus using plasma biomarkers: a case-control study. J Perinat Med. 2019;47(2):161–8. https://doi.org/10.1515/jpm-2018-0120.
Article CAS PubMed Google Scholar
Plows JF, Stanley JL, Baker PN, Reynolds CM, Vickers MH. The pathophysiology of gestational diabetes mellitus. Int J Mol Sci. 2018;19(11):3342. https://doi.org/10.3390/ijms19113342.
Article CAS PubMed PubMed Central Google Scholar
Zhang Y, Xiao CM, Zhang Y, Chen Q, Zhang XQ, Li XF, Shao RY, Gao YM. Factors associated with gestational diabetes mellitus: a meta-analysis. J Diabetes Res. 2021;2021:6692695. https://doi.org/10.1155/2021/6692695.
Article PubMed PubMed Central Google Scholar
Oskovi-Kaplan ZA, Ozgu-Erdinc AS. Management of gestational diabetes mellitus. Adv Exp Med Biol. 2021;1307:257–72. https://doi.org/10.1007/5584_2020_552.
Article CAS PubMed Google Scholar
Avvisato R, Forzano I, Varzideh F, Mone P, Santulli G. A machine learning model identifies a functional connectome signature that predicts blood pressure levels: imaging insights from a large population of 35 882 patients. Cardiovasc Res. 2023;119(7):1458–60. https://doi.org/10.1093/cvr/cvad065.
Article CAS PubMed Google Scholar

Download references

Funding

The authors declare that no funds, grants, or other support were received during the preparation of this manuscript.

Author information

Authors and Affiliations

Department of Medicine and Surgery, University of Ilorin, Ilorin, PMB 5000, Nigeria
Emmanuel Kokori, Gbolahan Olatunji, David Isarinade, Irene Ajayi & Habeebat Nurudeen-Busari
Department of Medicine, Ladoke Akintola University of Technology, Ogbomoso, Nigeria
Nicholas Aderinto, Ifeanyichukwu Muogbo & Abdulbasit Opeyemi Muili
Department of Medicine, Siberian State Medical University, Tomsk, Russia
Ikponmwosa Jude Ogieuhi
Department of Internal Medicine, Asokoro District Hospital, Abuja, Nigeria
Bonaventure Ukoaka
Department of Medicine and Surgery, Olabisi Onabanjo University, Ogun, Nigeria
Ayodeji Akinmeji
Department of Medicine and Surgery, AfeBabalola University, Ado-Ekiti, Nigeria
Ezenwoba Chidiogo
Department of Medicine, Lagos State Health Service Commission, Lagos, Nigeria
Owolabi Samuel
Department of Allied and Public Health, School of Health, Sport and Bioscience, University of East London, London, UK
David B. Olawade

Authors

Emmanuel Kokori
View author publications
You can also search for this author in PubMed Google Scholar
Gbolahan Olatunji
View author publications
You can also search for this author in PubMed Google Scholar
Nicholas Aderinto
View author publications
You can also search for this author in PubMed Google Scholar
Ifeanyichukwu Muogbo
View author publications
You can also search for this author in PubMed Google Scholar
Ikponmwosa Jude Ogieuhi
View author publications
You can also search for this author in PubMed Google Scholar
David Isarinade
View author publications
You can also search for this author in PubMed Google Scholar
Bonaventure Ukoaka
View author publications
You can also search for this author in PubMed Google Scholar
Ayodeji Akinmeji
View author publications
You can also search for this author in PubMed Google Scholar
Irene Ajayi
View author publications
You can also search for this author in PubMed Google Scholar
Ezenwoba Chidiogo
View author publications
You can also search for this author in PubMed Google Scholar
Owolabi Samuel
View author publications
You can also search for this author in PubMed Google Scholar
Habeebat Nurudeen-Busari
View author publications
You can also search for this author in PubMed Google Scholar
Abdulbasit Opeyemi Muili
View author publications
You can also search for this author in PubMed Google Scholar
David B. Olawade
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

All authors contributed to the study’s conception and design. Nicholas Aderinto, Gbolahan Olatunji and Emmanuel Kokori performed material preparation, data collection and analysis. All authors wrote the first draft of the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Nicholas Aderinto.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors have no relevant financial or non-financial interests to disclose.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

Reprints and permissions

About this article

Cite this article

Kokori, E., Olatunji, G., Aderinto, N. et al. The role of machine learning algorithms in detection of gestational diabetes; a narrative review of current evidence. Clin Diabetes Endocrinol 10, 18 (2024). https://doi.org/10.1186/s40842-024-00176-7

Download citation

Received: 19 January 2024
Accepted: 20 February 2024
Published: 25 June 2024
DOI: https://doi.org/10.1186/s40842-024-00176-7

The role of machine learning algorithms in detection of gestational diabetes; a narrative review of current evidence

Abstract

Similar content being viewed by others

Prediction models for the risk of gestational diabetes: a systematic review

A Comprehensive Review on Machine Learning Approaches for Prediction of Gestational Diabetes Mellitus

Prediction of gestational diabetes based on nationwide electronic health records

Introduction

Methodology

Literature search strategy

Inclusion and exclusion criteria

Study selection

Data extraction

Data synthesis

Results

Model performance and comparison

Early prediction

Results in diverse populations

Predictive models for specific cohorts

Clinical data and treatment modality

Discussion

Limitations and strengths of review

Conclusion

Availability of data and materials

Abbreviations

References

Funding

Author information

Authors and Affiliations

Contributions

Corresponding author

Ethics declarations

Ethics approval and consent to participate

Consent for publication

Competing interests

Additional information

Publisher’s Note

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation