Gestational Diabetes Prevalence Estimates from Three Data Sources, 2018

Introduction We investigated 2018 gestational diabetes mellitus (GDM) prevalence estimates in three surveillance systems (National Vital Statistics System, State Inpatient Database, and Pregnancy Risk Assessment Monitoring System). Methods We calculated GDM prevalence for jurisdictions represented in each system; a subset of data was analyzed for people 18–39 years old in 22 jurisdictions present in all three systems to observe dataset-specific demographics and GDM prevalence using comparable categories. Results GDM prevalence estimates varied widely by data system and within the data subset despite comparable demographics. Discussion Understanding the differences between GDM surveillance data systems can help researchers better identify people and places at higher risk of GDM.

data were available in all three data systems. We additionally examined demographic characteristics of the analytic subset. An understanding of the differences between GDM surveillance data systems, including their strengths and limitations, can help researchers better identify people and places at higher risk of GDM.

Methods
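GDM prevalence was estimated using 2018 data from three surveillance systems: Centers for Disease Control and Prevention's (CDC) National Center for Health Statistics' National Vital Statistics System (NVSS) birth certificate data; Agency for Healthcare Research and Quality's Healthcare Cost and Utilization Project State Inpatient Databases (SID) hospital discharge data; and CDC's Pregnancy Risk Assessment Monitoring System (PRAMS) survey data.
Birth certificates are a complete enumeration of US births and are compiled in the NVSS.¹ Birth certificates indicate the presence of GDM based on medical records. We queried the NVSS using CDC WONDER (https://wonder.cdc.gov/) for US states and the District of Columbia (DC). GDM prevalence was calculated as GDM-associated births divided by total live births, using only singletons or the first birth for multiple gestations to avoid overcounting people with multiple births.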
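The counting rule above can be illustrated with a minimal Python sketch. It assumes record-level data with hypothetical fields (`gdm`, `plurality`, `birth_order`); the study itself queried aggregated counts through CDC WONDER, so this shows only the logic of the rule, not the authors' procedure.

```python
from dataclasses import dataclass


@dataclass
class BirthRecord:
    # Hypothetical fields, chosen for illustration only.
    gdm: bool         # GDM indicated on the birth certificate
    plurality: int    # 1 = singleton, 2 = twins, ...
    birth_order: int  # position within the delivery (1 = first born)


def counts_toward_prevalence(rec: BirthRecord) -> bool:
    """Keep singletons and only the first birth of a multiple gestation."""
    return rec.plurality == 1 or rec.birth_order == 1


def gdm_prevalence(records: list[BirthRecord]) -> float:
    """GDM-associated births divided by total live births, as a percentage."""
    kept = [r for r in records if counts_toward_prevalence(r)]
    if not kept:
        raise ValueError("no eligible births")
    return 100.0 * sum(r.gdm for r in kept) / len(kept)


records = [
    BirthRecord(gdm=True, plurality=2, birth_order=1),   # first twin: kept
    BirthRecord(gdm=True, plurality=2, birth_order=2),   # second twin: dropped
    BirthRecord(gdm=False, plurality=1, birth_order=1),
    BirthRecord(gdm=False, plurality=1, birth_order=1),
    BirthRecord(gdm=False, plurality=1, birth_order=1),
]
print(gdm_prevalence(records))  # 25.0
```

Without the rule, the twin delivery would contribute two GDM-associated births and the toy estimate would rise to 40%, which is the overcounting the restriction is meant to avoid.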
The SID is an unweighted census of more than 95% of hospital discharge records at the state level.² Live births with administrative claims code(s) for GDM were identified using diagnosis-related group (DRG) and International Classification of Diseases, Tenth Revision, Clinical Modification (ICD-10-CM) codes.³ GDM prevalence was calculated for the available 27 states and DC as live hospital births with GDM divided by total live hospital births.
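As a rough illustration of claims-based case identification, the sketch below flags discharge records that carry a diagnosis in ICD-10-CM subcategory O24.4 (gestational diabetes mellitus). The exact DRG and ICD-10-CM code list used in the study is not reproduced in this text, so the prefix match and the input layout (one list of diagnosis codes per live hospital birth) are assumptions.

```python
# Assumption: GDM is identified by ICD-10-CM subcategory O24.4 (dots removed).
# The study's actual DRG/ICD-10-CM code list is not shown in the text.
GDM_PREFIX = "O244"


def normalize(code: str) -> str:
    """Strip dots and whitespace so 'O24.410' and 'o24410' compare equal."""
    return code.replace(".", "").strip().upper()


def has_gdm(diagnosis_codes: list[str]) -> bool:
    """True if any diagnosis on the discharge record is a GDM code."""
    return any(normalize(c).startswith(GDM_PREFIX) for c in diagnosis_codes)


def sid_prevalence(discharges: list[list[str]]) -> float:
    """Live hospital births with GDM divided by total live hospital births (%)."""
    if not discharges:
        raise ValueError("no discharge records")
    n_gdm = sum(has_gdm(codes) for codes in discharges)
    return 100.0 * n_gdm / len(discharges)


# Toy input: each inner list holds the diagnosis codes of one live hospital birth.
discharges = [
    ["O24.410", "Z37.0"],
    ["O80", "Z37.0"],
    ["O24.420", "Z37.0"],
    ["O34.211", "Z37.0"],
]
print(sid_prevalence(discharges))  # 50.0
```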
PRAMS is a population-based survey of a sample of people with live births.⁴ Participants are sampled from birth certificates, with each participating jurisdiction sampling
¹ https://wonder.cdc.gov/natality.html
We also analyzed a subset of people for whom data were available in all three data systems (limited to 22 jurisdictions and to ages 18-39 years due to small sample sizes of younger and older age groups) to examine demographic characteristics. All analyses were performed using SAS (version 9.4; SAS Institute), accounting for complex sampling and weighting in PRAMS.
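To show why weighting matters for a sampled survey such as PRAMS, here is a minimal Python sketch of a weighted prevalence point estimate. The respondent data and weights are hypothetical, and the sketch deliberately omits the design-based variance estimation (strata, nonresponse adjustments) that survey procedures in SAS handle when producing confidence intervals.

```python
def weighted_prevalence(responses: list[tuple[bool, float]]) -> float:
    """Weighted percentage of respondents reporting GDM.

    responses: (reported_gdm, analysis_weight) pairs; values are hypothetical.
    """
    total_weight = sum(w for _, w in responses)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    gdm_weight = sum(w for reported, w in responses if reported)
    return 100.0 * gdm_weight / total_weight


# Each respondent stands in for `weight` people in the target population.
sample = [(True, 50.0), (False, 150.0), (False, 200.0), (True, 100.0)]

unweighted = 100.0 * sum(r for r, _ in sample) / len(sample)
print(unweighted)                   # 50.0
print(weighted_prevalence(sample))  # 30.0
```

In this toy sample the respondents reporting GDM carry smaller weights, so the weighted estimate falls well below the raw proportion.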
Results
GDM prevalence estimates varied widely by data system [Fig. 2]. Among 45 jurisdictions with data in more than one data system, discrepancies between estimates ranged from 0.1 percentage points (Massachusetts and Nevada) to 5.6 percentage points (West Virginia). Among 24 jurisdictions with data in all three systems, West Virginia had the largest range in prevalence (6.1-11.7%), while Colorado (5.3-6.4%) and Iowa (7.6-8.7%) had the smallest ranges across the three systems.
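The discrepancy for a jurisdiction is the gap between its lowest and highest estimate across data systems. A short Python check, using only the ranges reported above (the variable and function names are illustrative), reproduces the figures:

```python
# Lowest and highest 2018 GDM prevalence estimate (%) across the three
# data systems, as reported in the text.
prevalence_range = {
    "West Virginia": (6.1, 11.7),
    "Colorado": (5.3, 6.4),
    "Iowa": (7.6, 8.7),
}


def spread(low: float, high: float) -> float:
    """Gap between highest and lowest estimate, in percentage points."""
    return round(high - low, 1)


for jurisdiction, (low, high) in prevalence_range.items():
    print(jurisdiction, spread(low, high))
# West Virginia 5.6
# Colorado 1.1
# Iowa 1.1
```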
When analyzing subsets of people aged 18-39 years in the 22 jurisdictions for which data were available in all three systems, dataset-wide GDM prevalence still varied across the three data systems: 6.6% (NVSS), 8.0% (SID), and 9.0% (PRAMS; 95% CI: 8.5-9.6%). Apart from race and Hispanic ethnicity, participant demographics were similar across data systems [Table 1]. A lower proportion of births in the NVSS were to non-Hispanic (NH) people of another race or more than one race compared with SID and PRAMS. A lower proportion of births in the SID were to Hispanic and NH Asian/Pacific Islander people compared with the other data systems. A higher proportion of births in PRAMS were to NH Black, NH White, and NH people of another race or more than one race. A higher proportion of births were missing race and ethnicity data in the SID (3.0%) and PRAMS (2.0%) compared with NVSS (0.7%).

Discussion
We document variation in GDM prevalence estimates across three data systems; this variation persists even when comparing subsets of participants with similar demographic characteristics. GDM prevalence estimates appear to be influenced by the strengths and limitations of each data system, including representativeness of the data, varying demographic and geographic coverage, recall bias for surveys, and completeness of documentation in administrative data.
NVSS data provide a complete enumeration of live births in US states and territories and contain detailed race and ethnicity information. However, NVSS may underreport GDM; studies have documented a low sensitivity relative to medical records (46-75.7%) (Gregory et al., 2019; Dietz et al., 2015; Devlin et al., 2009). Data quality also varies across hospitals (Gregory et al., 2019). Further studies might explore how to improve documentation of GDM on birth certificates.
SID data have several strengths, including that they encompass more than 95% of US community hospital discharges. However, jurisdiction participation changes over time.⁵ In addition, SID is derived from claims data, which are subject to coding errors (e.g., missing and inaccurate data). Further, unlike NVSS and PRAMS, the SID only includes hospital births. While NVSS indicates that <2% of births nationwide occurred outside of a hospital in 2018, some jurisdictions have higher rates of non-hospital births, such as Alaska (7%). Pregnancies affected by GDM are at increased risk of adverse outcomes and may require hospital-based management during delivery, which may result in higher prevalence of GDM in the SID than might be seen in other birth settings. South Dakota, Rhode Island, West Virginia, and DC had more hospital deliveries identified in the SID than live births in birth certificate data, for unknown reasons. Finally, SID has fewer race and ethnicity groupings relative to the NVSS and PRAMS, limiting our results to Hispanic and non-Hispanic single race categories.
PRAMS offers detailed survey data on health before, during, and after a live birth, which are linked to demographic data in birth certificates. However, jurisdiction participation in the survey varies year to year.⁶ In addition, survey response rates vary by jurisdiction and have been decreasing over time; seven states participating in 2018 did not meet the 50% response rate criterion. PRAMS may also be subject to biases associated with self-reported data (DeSisto et al., 2014; Dietz et al., 2014). Nevertheless, a 2014 validation study of PRAMS data found moderate sensitivity and excellent specificity, but poor positive predictive values for self-reported GDM, compared to medical records, indicating that self-report of GDM may indeed be a valid information source (Dietz et al., 2014).
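The combination of excellent specificity and poor positive predictive value can look contradictory, but it follows from GDM being relatively uncommon. The Python sketch below computes the three measures from a 2x2 table of self-report against medical records. The counts are hypothetical, chosen only to illustrate the arithmetic; they are not taken from the cited validation studies.

```python
def validation_metrics(tp: int, fp: int, fn: int, tn: int) -> dict[str, float]:
    """Sensitivity, specificity, and PPV (%) of self-report vs. medical records.

    tp: self-reported GDM, confirmed in records
    fp: self-reported GDM, not in records
    fn: no self-report, GDM in records
    tn: no self-report, no GDM in records
    """
    return {
        "sensitivity": 100.0 * tp / (tp + fn),
        "specificity": 100.0 * tn / (tn + fp),
        "ppv": 100.0 * tp / (tp + fp),
    }


# Hypothetical cohort of 1,000 people, 80 of whom have GDM in their records.
metrics = validation_metrics(tp=56, fp=46, fn=24, tn=874)
for name, value in metrics.items():
    print(f"{name}: {value:.1f}%")
# sensitivity: 70.0%
# specificity: 95.0%
# ppv: 54.9%
```

Because people without GDM far outnumber those with it, even a 5% false-positive rate among them yields nearly as many false positives (46) as true positives (56), which pulls the positive predictive value down.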
⁶ https://www.cdc.gov/prams/prams-data/researchers.htm#response
This study has two primary limitations. First, it carries over the limitations of the original data systems. Second, we caution against comparing GDM prevalence estimates across the data systems, because each uses different methodologies and includes different populations.
Nevertheless, this study provides a description of GDM prevalence data from three data systems and considers their strengths and limitations as data sources for GDM surveillance.
GDM screening combined with high-quality surveillance data can inform public health efforts to prevent, identify, and manage this potentially serious pregnancy complication. Improved surveillance, such as through hospital-based quality improvement initiatives, is needed to increase the accuracy of the data and to better identify the disproportionate burden of GDM across geographic and demographic factors. However, all three data systems can provide a useful

Fig. 1 Gestational diabetes mellitus (GDM) prevalence estimates for 2018 as indicated by National Vital Statistics System (NVSS) (a), State Inpatient Database (SID) (b), and Pregnancy Risk Assessment Monitoring System (PRAMS) data (c). Greyed-out jurisdictions do not have data available for analysis. Maps created at mapchart.net

Fig. 2 GDM prevalence estimates for 2018 for jurisdictions represented in National Vital Statistics System (pink square), State Inpatient Database (green circle), and Pregnancy Risk Assessment Monitoring System (…). Blue error bars indicate 95% CIs for PRAMS estimates. Data are from the Centers for Disease Control and Prevention's (CDC) National Center for Health Statistics' National Vital Statistics System (NVSS) birth certificate data; Agency for Healthcare Research and Quality's Healthcare Cost and Utilization Project State Inpatient Databases (SID) hospital discharge data; and CDC's Pregnancy Risk Assessment Monitoring System (PRAMS) survey data.

Table 1 Demographic characteristics for people aged 18-39 years old who had a live birth in 22 jurisdictions represented in National Vital Statistics System, State Inpatient Database, and Pregnancy Risk Assessment Monitoring System, 2018

To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.