Genome-wide association and Mendelian randomisation analysis among 30,699 Chinese pregnant women identifies novel genetic and molecular risk factors for gestational diabetes and glycaemic traits

Aims/hypothesis Gestational diabetes mellitus (GDM) is the most common disorder in pregnancy; however, its underlying causes remain obscure. This study aimed to investigate the genetic and molecular risk factors contributing to GDM and glycaemic traits.
Methods We collected non-invasive prenatal test (NIPT) sequencing data along with four glycaemic and 55 biochemical measurements from 30,699 pregnant women during a 2-year period at Shenzhen Baoan Women’s and Children’s Hospital in China. Genome-wide association studies (GWAS) were conducted between genotypes derived from NIPTs and GDM diagnosis, baseline glycaemic levels and glycaemic levels after glucose challenges. In total, 3317 women were diagnosed with GDM, while 19,565 served as control participants. The results were replicated using two independent cohorts. Additionally, we performed one-sample Mendelian randomisation to explore potential causal associations between the 55 biochemical measurements and risk of GDM and glycaemic levels.
Results We identified four genetic loci significantly associated with GDM susceptibility. Among these, MTNR1B exhibited the highest significance (rs10830963-G, OR [95% CI] 1.57 [1.45, 1.70], p=4.42×10⁻²⁹), although its effect on type 2 diabetes was modest. Furthermore, we found 31 genetic loci, including 14 novel loci, that were significantly associated with the four glycaemic traits. The replication rates of these associations with GDM, fasting plasma glucose levels and 0 h, 1 h and 2 h OGTT glucose levels were 4 out of 4, 6 out of 9, 10 out of 11, 5 out of 7 and 4 out of 4, respectively. Mendelian randomisation analysis suggested that a genetically regulated higher lymphocyte percentage and lower white blood cell count, neutrophil percentage and absolute neutrophil count were associated with elevated glucose levels and an increased risk of GDM.
Conclusions/interpretation Our findings provide new insights into the genetic basis of GDM and glycaemic traits during pregnancy in an East Asian population and highlight the potential role of inflammatory pathways in the aetiology of GDM and variations in glycaemic levels.
Data availability Summary statistics for GDM; fasting plasma glucose; 0 h, 1 h and 2 h OGTT; and the 55 biomarkers are available in the GWAS Atlas (study accession no.: GVP000001, https://ngdc.cncb.ac.cn/gwas/browse/GVP000001).
Supplementary Information The online version of this article (10.1007/s00125-023-06065-5) contains peer-reviewed but unedited supplementary material.


ESM Methods

The NIPT PLUS cohort
In addition to the previously mentioned participants, we collected data from 4,688 individuals who attended maternity check-ups at Shenzhen Baoan Women's and Children's Hospital (Shenzhen, China) throughout the entire 40-week gestational period. These individuals underwent NIPT in either the first or second trimester between 2020 and 2021. Notably, these samples were sequenced more deeply than conventional NIPT samples, with an average depth of 0.3x. For this study, we used their recorded data on physical measurements, blood glucose levels, lymphocyte percentage, white blood cell count, neutrophil percentage, absolute neutrophil count, and clinical diagnostic information related to GDM during pregnancy.

Statistical test of different genetic effects between two GWAS
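In this study, we compared our GWAS results with those from MAGIC, and compared the results for baseline glucose levels with those for challenged glucose levels. We examined the difference in genetic effects between two GWAS with a two-sided two-sample t-test with the following hypotheses:

$$H_0: \beta_1 = \beta_2 \quad \text{versus} \quad H_1: \beta_1 \neq \beta_2$$

The T statistic was computed as follows:

$$T = \frac{\beta_1 - \beta_2}{\sqrt{SE_1^2 + SE_2^2}} \tag{1}$$

The degrees of freedom $\nu'$ was determined by the formula:

$$\nu' = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \dfrac{\left(s_2^2/n_2\right)^2}{n_2 - 1}} \tag{2}$$

Herein, $\beta_1$ and $\beta_2$ represent the genetic effects associated with specific traits (for example, baseline glucose levels) and other traits (for example, challenged glucose levels), respectively. $s_1^2$ and $s_2^2$ denote the sample variances, whereas $SE_1$ and $SE_2$ stand for the estimated standard errors, and $n_1$ and $n_2$ are the corresponding sample sizes. The T statistic in equation (1) follows a t-distribution with $\nu'$ degrees of freedom. To address potential inequality between $s_1^2$ and $s_2^2$, the adjusted $\nu'$ was employed, computed according to formula (2).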
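
As a rough illustration of the test described above, the following Python sketch computes the T statistic, the adjusted degrees of freedom and a two-sided P value for a pair of effect estimates. It assumes the Welch-Satterthwaite form of the adjustment and that each squared standard error equals the sample variance divided by the sample size. The function name `compare_effects`, the sample-size arguments and the example numbers are hypothetical and are not taken from the study. Because the degrees of freedom in a GWAS of this size run to the tens of thousands, the P value here uses a normal approximation to the t-distribution rather than the exact t tail.

```python
import math


def compare_effects(beta1, se1, n1, beta2, se2, n2):
    """Compare two GWAS effect estimates (hypothetical helper, not the authors' code).

    Assumes SE_i^2 = s_i^2 / n_i, so the Welch-Satterthwaite degrees of
    freedom can be written in terms of the squared standard errors.
    """
    v1, v2 = se1 ** 2, se2 ** 2
    # T statistic: difference in effects over its standard error
    t_stat = (beta1 - beta2) / math.sqrt(v1 + v2)
    # Adjusted degrees of freedom for unequal variances
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    # Two-sided P value, normal approximation (reasonable only for large df)
    p_value = math.erfc(abs(t_stat) / math.sqrt(2))
    return t_stat, df, p_value


# Illustrative numbers only
t_stat, df, p_value = compare_effects(0.10, 0.01, 10001, 0.05, 0.01, 10001)
print(f"T = {t_stat:.3f}, df = {df:.0f}, P = {p_value:.2e}")
```

With small samples the exact t tail should be used instead of the normal approximation; the structure of the calculation stays the same.
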
For all GWAS loci tested, we used a P-value threshold of P < 0.05 to define statistical significance and reported the specific genetic loci.

1 Geographic distribution of the participants in this study.
Columns: Province; Administrative

ESM Table 13 The power of Mendelian randomization (MR) analyses.

ESM Fig. 4 Comparison of the genome-wide association analysis for FPG and OGTT2H between MAGIC East Asian and our study. Panels a and c show results from our study: newly identified loci are shown in red and previously established loci in black. Panels b and d show the known loci from the MAGIC East Asian OGTT2H and FPG GWAS summary statistics, which were downloaded from https://magicinvestigators.org/downloads/.

ESM Fig. 6 LocusZoom plots of genome-wide significant loci associated with the seven traits investigated in the study. For each of the 35 lead SNPs in ESM Table 8, a regional plot generated with the LocusZoom software shows the P values and LD r² of SNPs located in the 500 kbp upstream and downstream flanking regions.

ESM Fig. 9 Effects of GDM and glycemic traits on the biomarkers by Mendelian randomization. The analysis was conducted between all 55 biomarkers collected in the pregnancy screening and GDM occurrence as well as the four quantitative glycemic traits. Significance was defined as in ESM Fig. 7, using Bonferroni correction (α = 0.05/55/5). Only biomarkers that demonstrated statistical significance after Bonferroni correction in at least one of the exposure-outcome analyses are included in the plot, while results for all the biomarkers are presented in ESM Fig. 11.

ESM Fig. 11 Scatter plots of Mendelian randomization analysis results with four biomarkers as the exposures and their effects on GDM, using meta-analysis data with the PLUS cohort. Panels a-d present the results of the Mendelian randomization analyses using absolute neutrophil count, neutrophil percentage, white blood cell count and lymphocyte percentage as the exposure, respectively. The effect size and P value from the inverse-variance weighted method are shown in each panel. The slope of the regression line indicates the direction of the effect of the exposure on GDM, with a positive slope representing a positive effect and a negative slope a negative effect.
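
To make the inverse-variance weighted slope and the Bonferroni threshold in the captions above concrete, here is a minimal Python sketch. It assumes a fixed-effect inverse-variance weighted estimator (a weighted regression of SNP-outcome effects on SNP-exposure effects through the origin) and a normal-based P value; whether the study used a fixed-effect or random-effects variant is not stated here. The function name `ivw_estimate` and the input numbers are hypothetical and purely illustrative.

```python
import math

# Bonferroni threshold from the ESM Fig. 9 caption: 55 biomarkers x 5 outcomes
BONFERRONI_ALPHA = 0.05 / 55 / 5


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate (hypothetical helper, not the authors' code)."""
    weights = [1.0 / se ** 2 for se in se_outcome]
    numerator = sum(w * bx * by
                    for w, bx, by in zip(weights, beta_exposure, beta_outcome))
    denominator = sum(w * bx ** 2 for w, bx in zip(weights, beta_exposure))
    slope = numerator / denominator        # slope of the line through the origin
    se = math.sqrt(1.0 / denominator)      # standard error of the slope
    p_value = math.erfc(abs(slope / se) / math.sqrt(2))  # two-sided, normal-based
    return slope, se, p_value


# Made-up SNP effects: a negative slope means higher exposure, lower outcome risk
slope, se, p_value = ivw_estimate(
    beta_exposure=[0.1, 0.2, 0.3],
    beta_outcome=[-0.02, -0.04, -0.06],
    se_outcome=[0.01, 0.01, 0.01],
)
print(f"slope = {slope:.3f}, SE = {se:.4f}, "
      f"significant = {p_value < BONFERRONI_ALPHA}")
```
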

ESM Table 7 Replication of the lead SNPs in the NIPT PLUS cohort, the BIGCS cohort and their meta-analysis.
Column groups: Discovery study; NIPT PLUS; BIGCS; BIGCS and PLUS meta results; Discovery versus BIGCS and PLUS meta results.
P: P value of the GWAS or meta-analysis; values less than 0.05 are in bold. Pdiff: P value for the comparison of the effect estimates from our primary study (the discovery cohort) with those obtained from the BIGCS and PLUS meta-analysis. Phet: P value for heterogeneity in SNP effects in the meta-analysis; values greater than 0.05 are in bold.
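
The sketch below shows one common way a two-cohort meta-analysis P value and a heterogeneity P value of the kind reported in this table could be obtained. It assumes a fixed-effect inverse-variance weighted meta-analysis with Cochran's Q as the heterogeneity statistic, which is an assumption since the meta-analysis model is not specified here. The function name `meta_two_cohorts` and the input values are hypothetical. With exactly two cohorts, Q has one degree of freedom, so its P value can be written with the complementary error function.

```python
import math


def meta_two_cohorts(beta_a, se_a, beta_b, se_b):
    """Fixed-effect meta-analysis of two cohorts (hypothetical helper)."""
    w_a, w_b = 1.0 / se_a ** 2, 1.0 / se_b ** 2
    beta = (w_a * beta_a + w_b * beta_b) / (w_a + w_b)   # pooled effect
    se = math.sqrt(1.0 / (w_a + w_b))                    # pooled standard error
    p_value = math.erfc(abs(beta / se) / math.sqrt(2))   # two-sided, normal-based
    # Cochran's Q: weighted squared deviations from the pooled effect
    q = w_a * (beta_a - beta) ** 2 + w_b * (beta_b - beta) ** 2
    # Chi-square survival function with 1 degree of freedom (two cohorts only)
    p_het = math.erfc(math.sqrt(q / 2.0))
    return beta, se, p_value, q, p_het


# Made-up effect estimates for two replication cohorts
beta, se, p_value, q, p_het = meta_two_cohorts(0.10, 0.02, 0.06, 0.02)
print(f"beta = {beta:.3f}, P = {p_value:.2e}, Q = {q:.2f}, Phet = {p_het:.3f}")
```

A Phet above 0.05 in this sketch would indicate no detectable heterogeneity between the two cohorts at that threshold.
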

ESM Table 8 Information on the lead SNPs for the five traits investigated in the study.
Figure 3. CHR: chromosome; BP (GRCh38): position on the human genome build GRCh38; A1: the effect allele; FRQ: the frequency of the effect allele; INFO: the information score from the imputation process; BETA: the effect size in the GWAS regression model. The 'Marker' column indicates whether the SNP was reported in previous studies (known) or has not been reported (novel). The 'REGION' column indicates the functional region of the SNP on the chromosome, and the 'EFFECT' column indicates the biological effect of the SNP.

ESM Table 9 Comparison of the 13 significant loci for FPG and OGTT2H in the MAGIC consortium East Asian population.
CHR: chromosome; BP (GRCh38): position on the human genome build GRCh38; EAF: effect allele frequency.

ESM Table 10 Detection of differences in SNP genetic effects between FPG and OGTT1H or OGTT2H for the FPG lead SNPs.
Effect_detection_P (FPG and OGTT1H): two-sample t-test P value for the difference in SNP genetic effects between the FPG and OGTT1H GWAS; Effect_detection_P (FPG and OGTT2H): two-sample t-test P value for the difference in SNP genetic effects between the FPG and OGTT2H GWAS. A P value less than 0.05 suggests a statistically significant difference in the effect of the single nucleotide polymorphism (SNP). Details of the two-sample t-test are given in the Supplementary Notes.

ESM Table 14 MR results for GDM, the four quantitative glycemic traits and the four biomarkers, using meta-analysis data with the PLUS cohort.
Method_het and Q_pval refer to the model used for the heterogeneity test and the P value of that test; egger_intercept, pleio_se and pleio_p refer to the intercept, standard error and P value of the pleiotropy test.