Background

Accurate determination of gestational age is important for assessing fetal development and maturity. This is necessary for investigating the impact of prenatal factors on pregnancy outcomes and any deviation from normal fetal development [1, 2]. Although gestational age at birth exhibits some normal variation, both preterm and post-term births are associated with an increased risk of adverse perinatal outcomes and health outcomes later in life [3,4,5,6,7]. The effects of gestational age at birth on health outcomes may be linked to epigenetic patterns established in utero or early in the postnatal period [8, 9]. Changes in these patterns may interfere with critical developmental processes [10,11,12] and trigger phenotypic changes that persist throughout life. This may be even more pertinent to children conceived by assisted reproductive technologies (ART), because ART procedures coincide with the extensive epigenetic reprogramming in the early embryo [13, 14].

DNA methylation (DNAm) is the most studied epigenetic mark in humans. In recent years, it has been used to build gestational age clocks, i.e., DNAm-based predictors of gestational age [15,16,17,18]. Earlier clocks were built using DNAm data from the Illumina HumanMethylation27 (27 K) or the Illumina HumanMethylation450 (450 K) BeadChip arrays, both of which have subsequently been replaced by the Illumina MethylationEPIC BeadChip (EPIC). EPIC covers 865,859 CpGs, nearly twice as many as 450 K, and has a stronger focus on regulatory elements [19]. Although EPIC includes over 90% of the probes on 450 K [19], six to eight of the CpGs included in existing gestational age clocks are not present on EPIC. This discrepancy may affect the precision of the published clocks in predicting gestational age when applied to DNAm data generated on EPIC [20]. Therefore, it is essential to develop a new gestational age clock that is updated and optimized for EPIC. Equally important is to elucidate whether the additional CpGs on EPIC enhance gestational age prediction.

A challenge in developing accurate gestational age clocks is the lack of information on the exact gestational age of the newborns. The standard approaches for estimating gestational age, based on ultrasound measurements or the last menstrual period (LMP), have thus far been used for training and testing epigenetic clocks. Ultrasound and LMP are widely used in clinical settings and have their individual advantages and limitations. While LMP can be informative, it suffers from large variability, in part due to varying length of the follicular phase. Ultrasound is much more precise but still depends on the size of the fetus at the time of ultrasound [1, 21, 22]. On the other hand, for children conceived by ART, the exact time when the embryo is transferred back to the uterus is known. Although there may be some differences in the days before fertilization and embryo transfer, and the developmental speed may differ in the in vitro setting, the embryo transfer date (ETD) provides a more direct estimate of gestational age [23]. Therefore, DNAm data from ART births is particularly advantageous for developing and validating gestational age clocks. To our knowledge, no gestational age clock has yet been developed using ETD, although its use has been called for previously [16].

In addition to gestational age prediction, gestational age clocks can be used to estimate gestational age acceleration (GAA), which is defined as the discrepancy between gestational age predicted from DNAm data and gestational age derived from clinical measurements [16, 24]. Investigating GAA is important because of its reported association with several measures related to birth outcomes, such as the cerebroplacental ratio (a robust indicator of prenatal stress [25]), higher maternal body mass index, and larger birth size [26]. Although children conceived by ART have a higher risk of spontaneous preterm birth [27] and other adverse perinatal outcomes [28,29,30], only one small study has explored GAA in ART children [31].

To address these knowledge gaps, we developed a new gestational age clock based on EPIC-derived DNAm data from newborns in the Norwegian Study of Assisted Reproductive Technologies (START), which is a substudy within the Norwegian Mother, Father and Child Cohort Study (MoBa) [32]. We validated this clock in test sets of ART and non-ART newborns in START, and also in an external dataset from the Finnish Prediction and Prevention of Preeclampsia and Intrauterine Growth Restriction (PREDO) study [33], which was used as a replication cohort. We also used the new EPIC-based clock to explore differences in GAA between ART and non-ART newborns.

Results

The EPIC gestational age clock

Table 1 and Fig. 1 provide overviews of the datasets used in this study. We fit a least absolute shrinkage and selection operator (Lasso) regression on DNAm data from 755 non-ART newborns in START. The model selected 176 CpGs as predictive of gestational age. Individual CpG sites and their corresponding coefficients are provided in Additional file 4.

Table 1 Characteristics of the datasets used to evaluate the EPIC GA clock
Fig. 1 Analysis flow. START newborns were grouped into ART and non-ART, and each group was randomly assigned to a training and test set. The non-ART training set was used to develop the EPIC GA clock and the 450 K/EPIC overlap clock. The ART training set was used to develop the ETD-based clock. All three clocks were tested in the non-ART test set. The EPIC GA clock, the Bohlin clock, and the Knight clock were also tested in the PREDO test set. The datasets are marked in green, and the clocks are marked in blue. START-derived datasets and clocks are marked with solid lines. External datasets and clocks are marked with dashed lines

We validated the resulting predictor, referred to as “EPIC GA clock” hereafter, in a test set of 200 non-ART newborns from START. The EPIC GA clock showed an R2 of 0.713 and a median absolute deviation (MAD) of 3.59 days (Fig. 2, Table 2).

Fig. 2 Using the EPIC GA clock to predict gestational age. Panel a shows the scatter plot of predicted gestational age against gestational age estimated by ultrasound in the training set (n = 755). Panel b shows the corresponding predicted gestational age in the test set (n = 200). The red line indicates a perfect correlation between DNAm-based gestational age and ultrasound-based gestational age. The black line indicates the MM-type robust regression of ultrasound-based gestational age on DNAm-based gestational age

Table 2 Results of gestational age prediction in START and PREDO

Comparison with previously published gestational age clocks in an external replication cohort (PREDO)

Using an external dataset of EPIC-derived DNAm data on 148 non-ART newborns from the PREDO study [33], we compared the performance of our EPIC GA clock with two published epigenetic gestational age clocks that were built on DNAm data from the previous methylation arrays: the Bohlin clock [15], based on 450 K, and the Knight clock [16], based on 27 K and 450 K. Eight CpGs in the Bohlin clock and six CpGs in the Knight clock were absent from the PREDO dataset and were thus excluded from the analysis. Compared to the Bohlin and Knight clocks, our EPIC GA clock showed higher precision and accuracy in predicting gestational age (Fig. 3, Table 3). The difference in R2 between the Bohlin clock and the EPIC GA clock was −0.062 (95% confidence interval (CI): −0.117, −0.014), and the difference in MAD was 3.27 days (95% CI: 1.87, 3.92). The corresponding statistics for the Knight clock versus our EPIC GA clock were −0.247 (95% CI: −0.342, −0.161) for R2 and 1.13 days (95% CI: 0.196, 2.40) for MAD.

Fig. 3 Prediction of gestational age in the PREDO non-ART dataset (n = 148). Panel a shows the scatter plot of predicted gestational age against gestational age estimated by ultrasound using the EPIC GA clock. The corresponding predictions using the Bohlin clock and the Knight clock are shown in panels b and c, respectively. The red line indicates a perfect correlation between DNAm-based gestational age and ultrasound-based gestational age. The black line indicates the MM-type robust regression of ultrasound-based gestational age on DNAm-based gestational age

Table 3 Bootstrapped differences in R2, SE, and MAD between different clocks and GA estimation methods

Assessing the impact of CpGs unique to EPIC on the prediction of gestational age

Of the 176 CpGs selected in the EPIC GA clock, 89 were found exclusively on EPIC. To assess whether the additional CpGs unique to EPIC affect the prediction parameters R2 and MAD, we built a separate clock using the same training set but restricted to the 397,473 probes that are present on both 450 K and EPIC. We compared the performance of this new “450 K/EPIC overlap clock” (173 CpGs) to the EPIC GA clock (Fig. 4; Table 2) and found no significant difference in R2 (−0.0001; 95% CI: −0.021, 0.018) or MAD (0.162; 95% CI: −0.375, 0.794) (Table 3). In terms of CpG overlap, 81 CpGs in the 450 K/EPIC overlap clock were also present in the EPIC GA clock.

Fig. 4 Prediction of gestational age using the EPIC GA, 450 K/EPIC, and ETD-based clocks. Scatter plots of predicted gestational age using (a) the EPIC GA clock, (b) the 450 K/EPIC overlap clock, and (c) the ETD-based clock against gestational age estimated by ultrasound in a test set (n = 200) of non-ART newborns from START. The red line indicates a perfect correlation between DNAm-based gestational age and ultrasound-based gestational age. The black line indicates the MM-type robust regression of ultrasound-based gestational age on DNAm-based gestational age

Using the embryo transfer date (ETD) to predict gestational age

A great advantage of the ART dataset is that the ETD is known for the ART-conceived children. We thus developed a gestational age clock using the ETD of ART-conceived children to investigate whether it was possible to achieve a better predictor of gestational age. Six hundred and seventy-four ART newborns from START (Table 1, Fig. 1) were used to train the ETD-based clock. Additional file 1: Figure S1 shows the performance of the ETD-based clock for ultrasound- and ETD-estimated gestational age in the START ART training and test set, respectively. When compared to the EPIC GA clock in the non-ART test set from START, the ETD-based clock showed a similar performance, with an R2 difference of 0.048 (95% CI: −0.041, 0.123) and a difference in MAD of 0.645 (95% CI: −0.181, 1.209) (Fig. 4; Table 3). The ETD-based GA clock contained 155 CpGs, and only 19 of them were in common with those of the EPIC GA clock.

Application of the EPIC GA clock to ART children

To assess the performance of the EPIC GA clock in ART children, we applied the EPIC GA clock to the cord-blood DNAm data of 838 newborns conceived by ART (Table 1, Fig. 1). We compared predicted gestational age to gestational age estimated by ultrasound measurements and by ETD, respectively (Fig. 5). Gestational age estimated by ultrasound measurement and by ETD was predicted with similar precision (R2 difference of 0.015 (95% CI: −0.003, 0.033); Fig. 5, Table 3) and accuracy (MAD difference of −0.102 (95% CI: −0.465, 0.174)).

Fig. 5 Prediction of gestational age estimated by ultrasound and embryo transfer date (ETD). Scatter plots of predicted gestational age using the EPIC GA clock against gestational age estimated by (a) ultrasound and (b) ETD in a dataset of ART-born children (n = 838) in START. The red line indicates a perfect correlation between DNAm-based gestational age and (a) ultrasound-based or (b) ETD-based gestational age. The black line shows the regression of (a) ultrasound-based or (b) ETD-based gestational age on DNAm-based gestational age

Gestational age acceleration in ART children

To assess whether GAA is associated with ART, we first regressed gestational age predicted by the EPIC GA clock on gestational age estimated by ultrasound in 200 non-ART and 838 ART newborns from START. GAA was calculated using the residuals from this regression. Next, we analyzed the relationship between GAA and ART by performing a logistic regression of ART on GAA. We found no significant difference in GAA between the ART (n = 838) and non-ART (n = 200) newborns (p = 0.388, Fig. 6).

Fig. 6 Gestational age and gestational age acceleration (GAA) in ART and non-ART children. Panel a shows predicted gestational age using the EPIC GA clock against gestational age estimated by ultrasound in ART (n = 838, highlighted in black) and non-ART (n = 200, highlighted in red) newborns from START. Panel b shows GAA represented by the regressions of EPIC GA clock-predicted gestational age on ultrasound-based gestational age in the ART and non-ART newborns

Aside from ETD, another major advantage of the START dataset is that the specific ART procedure used for conception was known, i.e., whether in vitro fertilization (IVF) was used alone or together with intracytoplasmic sperm injection (ICSI), and whether the embryo was transferred fresh or after being frozen. We found no significant difference in GAA between newborns conceived by IVF alone (n = 470) and those conceived by IVF in combination with ICSI (n = 338) (p = 0.976, Additional file 2: Figure S2). Furthermore, there was no significant difference between fresh (n = 693) and frozen (n = 115) embryo transfer (p = 0.274, Additional file 3: Figure S3).

Gene-enrichment analysis

To explore the biological significance of the 176 CpGs selected in our EPIC GA clock, we performed gene-enrichment analyses of the genes annotated for the selected CpGs. Using the annotation data provided in Illumina’s Infinium MethylationEPIC v1.0 B4 Manifest file, we identified 154 unique gene names annotated for the 176 selected CpGs. A list of the 176 CpGs and their annotated genes is provided in Additional file 4. The software WebGestalt [34] was used to perform gene-enrichment analyses of the 154 genes [35]. WebGestalt identified 78 categories as being significantly enriched at a false discovery rate (FDR) < 0.01. The category with the highest enrichment ratio was “regulation of platelet-derived growth factor receptor signaling pathway,” containing LRP1, HIP1R, HGS, and SRC (enrichment ratio = 37; FDR = 0.003). Several of the significant hits were related to abnormal morphology of the eye, ear, nose, and other developmental categories, e.g., “plasma membrane-bounded cell projection organization” and “negative regulation of cellular biosynthetic process.” The complete output of the WebGestalt analyses is provided in Additional file 5.

Discussion

We present the first EPIC-based predictor of gestational age and demonstrate its robustness and precision in ART versus non-ART newborns. This study benefited greatly from having the largest ART dataset to date, with detailed information on ETD and the specific procedure used for conception. Our EPIC GA clock, trained on the START dataset, outperformed previous cord blood-based gestational age clocks when compared in an independent Finnish test set (PREDO).

Previous DNAm-based clocks were developed using the now outdated 27 K and 450 K arrays. EPIC has almost twice as many CpGs as 450 K, and while 27 K and 450 K mostly cover areas around genes and CpG islands, some of the additional probes on EPIC target distal regulatory elements and intergenic regions [36]. We therefore hypothesized that the additional CpGs unique to EPIC might have enhanced the performance of the EPIC GA clock. However, when we developed a separate clock featuring only those probes that are shared between 450 K and EPIC, we observed a similar performance to the EPIC GA clock, indicating that the additional CpGs on EPIC did not significantly enhance the prediction of gestational age. This observation is consistent with recent findings on age prediction by Lee et al. [37]. Another plausible explanation for the superior performance of our EPIC GA clock might be related to the fact that eight CpGs in the Bohlin clock and six CpGs in the Knight clock are absent from the EPIC array. This discrepancy might have reduced the prediction accuracy of the earlier clocks when applied to EPIC data.

A substantial advantage of the START dataset is its large sample size combined with detailed information on ETD for the ART-conceived newborns and the specific ART procedures used for conception. Using ETD provides a more direct estimate of gestational age than estimates based on ultrasound measurement or LMP [23]. We thus checked whether a clock trained on gestational age estimated by ETD would lead to a further improvement in gestational age prediction. The results showed that the two clocks had similar performance, despite the low overlap in CpGs and genes. This suggests that using ETD-based gestational age estimates for training does not significantly enhance prediction compared to clocks trained on ultrasound-based estimates, further highlighting the precision of the EPIC GA clock.

A higher risk of spontaneous preterm birth and other adverse perinatal outcomes has been reported among ART-conceived children [28,29,30]. Given that the timing of ART procedures coincides with the extensive epigenetic remodeling in the gametes and early embryo, and that epigenetic alterations have been reported in ART embryos and children [38,39,40], we investigated whether the epigenetic gestational age of ART newborns differed significantly from that of non-ART newborns. When we applied the EPIC GA clock to ART newborns, the precision of the gestational age prediction remained similar to that of the non-ART newborns, indicating that the clock is also well suited for predicting gestational age in ART newborns. Furthermore, the EPIC GA clock predicted both ETD-based and ultrasound-based gestational age equally well, again underscoring the precision of the clock. Finally, we found no significant differences in GAA between ART and non-ART newborns.

ART is a collective term used to describe different procedures and categories that may have different impacts on fetal DNAm. It is therefore particularly important to investigate whether gestational age prediction differs according to the specific ART procedure used. For instance, embryos may be transferred to the uterus when they are fresh or after being frozen, and IVF may or may not involve ICSI. A previous study [31] examining GAA in ICSI newborns compared to non-ART newborns did not find any significant difference between the two groups. However, the authors detected a significant decrease in DNAm-predicted gestational age at birth among the ICSI newborns. To verify these findings in our dataset, we conducted another set of analyses to explore differences between IVF, ICSI, and non-ART newborns, as well as between fresh, frozen, and non-ART-conceived newborns. We found no significant differences in DNAm-predicted GA or GAA between any of the groups (Additional file 2: Figure S2 and Additional file 3: Figure S3), further strengthening the hypothesis that GAA is not associated with ART.

Although DNAm is strongly associated with gestational age, the mechanisms underlying this association are not well understood. A closer inspection of the specific CpGs selected for gestational age prediction and the overlap between different clocks may provide some answers. Of the 176 CpGs selected by the EPIC GA clock, only 11 were in common with the CpGs in the Bohlin clock, and none overlapped with the CpGs in the Knight clock. This could partly be explained by the 89 EPIC-specific CpGs. The lack of overlap in CpGs across different clocks has also been observed in age prediction models [41]. Our analyses showed little overlap between the EPIC GA clock and the ETD-based clock, even though both were trained on EPIC data. As Lasso regression and elastic net regression may select CpGs that are not associated with the outcome per se [42], dataset-specific CpGs could end up being included in the model. Furthermore, Lasso selects one CpG for each group of correlated (or neighboring) CpGs, whereas elastic net regression selects several CpGs, leading to a so-called “grouping effect” [43], which could lead to less overlap in CpGs between prediction models.

Unraveling the biological mechanisms underlying the gestational age clocks requires identifying the genes associated with the clock-specific CpGs and examining how they are related to gestational age. Our results revealed several genes in common across the different clocks. For example, 13 genes were shared between the EPIC GA clock and the Bohlin clock, while 15 genes were shared between the EPIC GA clock and the ETD-based clock. Some of the CpGs and genes in the EPIC GA clock appear to be stably associated with gestational age. For example, CpGs linked to Nuclear Receptor Corepressor 2 (NCOR2) and Insulin-Like Growth Factor 2 mRNA-Binding Protein 1 (IGF2BP1) were selected in both the EPIC GA clock and the Bohlin clock, and both of these genes have previously been identified in other studies of gestational age [44,45,46,47]. NCOR2 is involved in vitamin A metabolism and lung function [48], and IGF2BP1 plays an important role in embryogenesis and carcinogenesis [49]. The EPIC GA clock also identified CpGs related to Corticotropin-Releasing Factor-Binding Protein (CRHBP), consistent with previous studies of gestational age [8, 50]. CRHBP levels rise throughout pregnancy but drop markedly when approaching term [51]. Furthermore, Mastorakos and Ilias [52] showed that CRHBP might prevent aberrant pituitary-adrenal stimulation in pregnancy. In addition to the genes mentioned here, several other genes linked to the CpGs in our clock have previously been implicated in gestational age, including Muscleblind Like Splicing Regulator 1 (MBNL1), CD82 molecule (CD82), Integrin Subunit Beta 2 (ITGB2), and Rap Guanine Nucleotide Exchange Factor 3 (RAPGEF3) [47, 50]. Additional studies are needed to elucidate their roles in gestational age.

For a clock to be useful, it needs to be generalizable to other cohorts and populations. As with the Bohlin clock, our EPIC GA clock was trained on data from a relatively homogeneous cohort in terms of ethnicity, socioeconomic status, and age [32, 53]. Our clock performed equally well in the independent Finnish PREDO cohort. However, while the use of a homogeneous training set may enhance the prediction model [42, 54], it can also result in a cohort-specific clock that is less generalizable to other populations.

Exploring associations between specific neonatal outcomes and DNAm-based gestational age is still in its nascent stages [26, 55], and there are many unanswered questions regarding neonatal development. The development of an EPIC-specific gestational age clock may offer additional insights into gestational age and neonatal development. As the 450 K array has been discontinued, we anticipate that future research on DNAm-based GA clocks will migrate to the newer EPIC array. Research on GA-related topics and DNAm utilizing the 450 K array is nevertheless expected to continue for some time, as many 450 K-based datasets are still in circulation and some are being used in consortia-led efforts. The clocks presented here may facilitate further research on DNAm-based clocks for both 450 K and EPIC-based arrays.

Conclusions

The new EPIC GA clock presented here predicted gestational age precisely in both ART and non-ART newborns and outperformed previous cord blood-based gestational age clocks when validated in an independent test set. The increased performance was not due to the higher coverage of CpGs on the EPIC array. Furthermore, the use of ETD-estimated gestational age for training did not improve the precision of gestational age prediction significantly compared with clocks trained on ultrasound-estimated gestational age. This is reassuring, as most datasets on newborns only have ultrasound- or LMP-based measures of gestational age. Finally, we did not find any significant association between GAA and ART. With a growing number of epigenetic datasets currently being generated on the EPIC platform, we expect our EPIC GA clock to become increasingly valuable in assessing developmental maturity in studies of neonatal development and disease.

Methods

Study population

MoBa is an ongoing, population-based pregnancy cohort study conducted by the Norwegian Institute of Public Health (NIPH). In total, 114,500 children, 95,200 mothers, and 75,200 fathers were recruited from all over Norway from 1999 through 2008 [32]. The MoBa mothers consented to participation in 41% of the pregnancies. Extensive details on the MoBa cohort have been provided elsewhere [32, 56]. START is a substudy of MoBa and consists of 1,995 newborns and their parents. Blood samples from the newborns were obtained from the umbilical cord at birth [56].

PREDO is a prospective pregnancy cohort of Finnish women who gave birth to a singleton live child between 2006 and 2010 [33]. The cohort comprises 1079 pregnant women; 969 of these had one or more known risk factors for preeclampsia and intrauterine growth restriction, whereas the rest had no such risk factors. The women were enrolled in the study when they arrived for their first ultrasound screening at 12–14 gestational weeks in 10 study hospitals in Southern and Eastern Finland. Blood samples were obtained from the cord blood of 998 newborns [57]. To validate the gestational age clocks, we used cord blood-based DNAm data from 148 newborns (Fig. 1).

DNAm profiling and quality control

Cord blood samples taken by a midwife immediately after birth were frozen [56]. Five hundred nanograms of DNA extracted from the cord blood of START newborns were shipped to LIFE & BRAIN GmbH in Bonn, Germany, for measurement of DNAm on the Illumina MethylationEPIC array (Illumina, San Diego, USA). The raw IDAT files were imported and processed in four batches using the R-package RnBeads [58]. We removed 44,210 cross-hybridizing probes [59] and approximately 10,000 probes with a high detection p-value (above 0.01). We also excluded 16,117 probes whose last three bases overlapped with a single-nucleotide polymorphism (SNP). The remaining DNAm signal was processed using BMIQ [60] to normalize the type I and type II probe chemistries. Control probes output from RnBeads were visually inspected for all samples, and those with low overall signals were removed. The Greedycut option [58] was used to remove outliers with markedly different DNAm signals than the rest of the samples. This resulted in the removal of 58 samples in total. For consistency, CpG sites removed from one batch, due to poor quality and detection p-value, were also removed from subsequent batches. After quality control, 770,586 autosomal CpGs and 1945 samples remained in the final dataset. The 1793 subjects for whom we had information on ultrasound-based gestational age were used to develop and validate the gestational age clocks in this study.

For the PREDO samples, DNA was extracted according to standard procedures. Methylation analyses were performed at the Max Planck Institute of Psychiatry in Munich, Germany. DNA samples were bisulfite-converted using the EZ-96 DNA Methylation kit (Zymo Research, Irvine, CA) and assayed on the Illumina Infinium MethylationEPIC array (Illumina, San Diego, USA). Three samples were excluded for being outliers based on their median intensity values. Another three samples showing discordant phenotypic and estimated sex were excluded. A further three samples were contaminated with maternal DNA and were also removed [61]. Methylation beta-values were normalized using the funnorm function [62] in the R-package minfi [63]. Three samples showed density artifacts after normalization and were removed from further analysis. We excluded probes on the sex chromosomes, probes containing SNPs, and cross-hybridizing probes according to previously published criteria [59, 64, 65]. Furthermore, CpGs with a detection p-value > 0.01 in at least 25% of the samples were also excluded. Finally, one duplicate sample was removed after quality control. The final dataset contained 812,987 CpGs and 148 samples. After normalization, no significant batch effects were identified.
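The detection p-value rule described for PREDO (exclude a CpG when its detection p-value exceeds 0.01 in at least 25% of the samples) can be illustrated with a minimal sketch. This is not the pipeline used in the study, which relied on the R-packages minfi and RnBeads; the function name, the dictionary layout, and the probe identifiers below are hypothetical and chosen only for illustration.

```python
def probes_failing_detection(detection_p, p_cutoff=0.01, min_fail_fraction=0.25):
    """Return probe IDs whose detection p-value exceeds p_cutoff
    in at least min_fail_fraction of the samples.

    detection_p maps a probe ID to a list of per-sample detection p-values
    (a hypothetical layout, not the minfi or RnBeads data structure).
    """
    failing = []
    for probe, pvals in detection_p.items():
        n_failed = sum(1 for p in pvals if p > p_cutoff)
        if n_failed / len(pvals) >= min_fail_fraction:
            failing.append(probe)
    return failing


# Toy example with made-up probe names and four samples each.
toy = {
    "probe_A": [0.001, 0.020, 0.001, 0.001],  # fails in 1 of 4 samples (25%)
    "probe_B": [0.001, 0.001, 0.001, 0.001],  # passes everywhere
    "probe_C": [0.500, 0.300, 0.001, 0.001],  # fails in 2 of 4 samples (50%)
}
excluded = probes_failing_detection(toy)
```

In this toy input, probe_A sits exactly on the 25% boundary and is excluded because the rule is stated as "at least 25%".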

Variables

For the START dataset, information on gestational age, sex, and ART status was extracted from the Medical Birth Registry of Norway (MBRN). Gestational age at birth was estimated by ultrasound measurements in week 18 of pregnancy. For the ART children, we used the date of egg retrieval plus 14 days to obtain a second estimate of gestational age. When the date of egg retrieval was not known, the date of embryo insertion was used instead, minus two days. For embryos that were frozen, we used the date of embryo insertion plus 14 days, and the number of days between egg retrieval and freezing. These three estimations of gestational age were combined into a variable called embryo transfer date (ETD). IVF and ICSI were defined as ART treatments, whereas children conceived by intrauterine insemination were defined as non-ART births.
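One possible reading of the ETD rules above can be written down as a small function. This is a sketch under stated assumptions, not the authors' code: we assume that "plus 14 days" means adding 14 days to the interval between the reference date and birth, that an unknown retrieval date is approximated by the insertion date minus two days, and that for frozen embryos the days between retrieval and freezing are added to the interval counted from insertion. All function and argument names are hypothetical.

```python
from datetime import date, timedelta


def etd_gestational_age_days(birth, retrieval=None, insertion=None,
                             frozen=False, freezing=None):
    """Gestational age at birth in days from ART dates (illustrative only)."""
    if frozen:
        # Assumed rule: time since insertion + 14 days + days the embryo
        # developed in vitro between egg retrieval and freezing.
        days_before_freezing = (freezing - retrieval).days
        return (birth - insertion).days + 14 + days_before_freezing
    if retrieval is None:
        # Assumed rule: retrieval took place two days before insertion.
        retrieval = insertion - timedelta(days=2)
    return (birth - retrieval).days + 14


# Fresh transfer with a known retrieval date: 266 days after retrieval + 14.
retrieval_day = date(2005, 1, 8)
ga_fresh = etd_gestational_age_days(
    birth=retrieval_day + timedelta(days=266), retrieval=retrieval_day
)
```

Under these assumptions, a birth 266 days after egg retrieval corresponds to a gestational age of 280 days, i.e., 40 completed weeks.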

For the PREDO dataset, information on gestational age and sex was extracted from the Finnish Medical Birth Register. Gestational age at birth was estimated by ultrasound measurements between 12 and 14 weeks of pregnancy.

Gestational age prediction

Figure 1 shows a flowchart of the analyses performed. Children conceived without ART (non-ART) were randomly split into two groups: a training set (~80%) for developing the clock and a test set (~20%) for validating the clock. We used Lasso regression from the R-package glmnet [66] to develop DNAm-based predictors of gestational age. Clinically estimated gestational age was regressed on the 770,586 remaining CpGs after quality control in the START dataset. For the “450 K/EPIC overlap clock,” only the 397,473 CpGs that were in common between 450 K and EPIC were used. Missing probes were imputed using the median imputation procedure in the R-package Hmisc [67]. Tuning parameters α and λ were selected after tenfold cross-validation in the training set. For the “EPIC GA clock,” Lasso regression selected 176 CpGs (α = 1, λ = 0.66), while for the 450 K/EPIC overlap clock and the “ETD-based clock,” 173 CpGs (α = 1, λ = 0.63) and 156 CpGs (α = 1, λ = 0.62) were selected, respectively. Individual CpG sites and their corresponding coefficients are provided in Additional file 4.
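To make the variable-selection step concrete, the following is a minimal sketch of Lasso regression (the α = 1 case of the elastic net) fitted by cyclic coordinate descent with soft-thresholding. It is a toy stand-in for glmnet, not the study's implementation: it only centers the predictors (glmnet standardizes them by default), it takes λ as given instead of choosing it by cross-validation, and the data are invented. The function names are hypothetical.

```python
def soft_threshold(z, gamma):
    """Shrink z toward zero by gamma; values within [-gamma, gamma] become 0."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimise (1/(2n)) * sum(residual^2) + lam * sum(|beta_j|)."""
    n, p = len(X), len(X[0])
    col_means = [sum(row[j] for row in X) / n for j in range(p)]
    Xc = [[row[j] - col_means[j] for j in range(p)] for row in X]
    y_mean = sum(y) / n
    resid = [yi - y_mean for yi in y]  # residuals while all betas are zero
    beta = [0.0] * p
    col_ss = [sum(Xc[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_ss[j] == 0.0:
                continue  # constant column carries no information
            # Covariance of column j with the residual that excludes column j.
            rho = sum(Xc[i][j] * resid[i] for i in range(n)) / n + col_ss[j] * beta[j]
            new_beta = soft_threshold(rho, lam) / col_ss[j]
            delta = new_beta - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * Xc[i][j]
                beta[j] = new_beta
    intercept = y_mean - sum(b * m for b, m in zip(beta, col_means))
    return intercept, beta


# Toy data: three centred "CpG" predictors, only the first strongly related
# to gestational age in days (values are invented for illustration).
x0 = [1, 1, 1, 1, -1, -1, -1, -1]
x1 = [1, 1, -1, -1, 1, 1, -1, -1]
x2 = [1, -1, 1, -1, 1, -1, 1, -1]
X = [list(row) for row in zip(x0, x1, x2)]
y = [280 + 7 * a + 0.2 * b for a, b in zip(x0, x1)]
intercept, beta = lasso_coordinate_descent(X, y, lam=0.5)
selected = [j for j, b in enumerate(beta) if b != 0.0]
```

In this toy design only the first predictor keeps a non-zero coefficient, because the weak effect of the second predictor falls below the penalty λ. The same mechanism is what reduces hundreds of thousands of candidate CpGs to a clock of fewer than 200.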

The above clocks were used to estimate gestational age in (i) the START non-ART test set, (ii) the START ART newborns, and (iii) the non-ART newborns from PREDO (see Fig. 1 for more details). Predicted gestational age was regressed on clinically estimated gestational age using MM-type robust linear regression [68] from the R-package robustbase [69]. The precision of a given prediction model was defined as the proportion of variance explained by the model (i.e., by the R2 value). Accuracy, on the other hand, was defined as the median absolute deviation (MAD) between observed and predicted gestational age.
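The two performance measures defined above can be sketched in a few lines. As a simplifying assumption, R2 is computed here as the squared Pearson correlation, which equals the R2 of an ordinary least-squares fit of predicted on observed gestational age; the study itself used MM-type robust regression from robustbase, which this sketch does not reproduce. Function names and toy values are hypothetical.

```python
from statistics import mean, median


def r_squared(observed, predicted):
    """R2 of a simple least-squares line (squared Pearson correlation)."""
    mx, my = mean(observed), mean(predicted)
    sxy = sum((x - mx) * (y - my) for x, y in zip(observed, predicted))
    sxx = sum((x - mx) ** 2 for x in observed)
    syy = sum((y - my) ** 2 for y in predicted)
    return sxy * sxy / (sxx * syy)


def median_abs_deviation(observed, predicted):
    """Median of |observed - predicted|, in the units of the input (days)."""
    return median(abs(x - y) for x, y in zip(observed, predicted))


# Invented gestational ages in days for five newborns.
ultrasound_ga = [266, 273, 280, 287, 294]
dnam_ga = [268, 271, 283, 286, 290]
precision = r_squared(ultrasound_ga, dnam_ga)
accuracy = median_abs_deviation(ultrasound_ga, dnam_ga)
```

For these toy values the absolute errors are 2, 2, 3, 1, and 4 days, so the MAD is 2 days.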

Comparison of prediction parameters

To compare the performances of the different clocks and GA estimation methods, we calculated the differences in R2, SE, and MAD when computed by two different clocks or GA methods. To assess the size and significance of the differences, we computed bootstrap confidence intervals for each difference. Since all three performance measures can be calculated from observed and predicted GA values, each bootstrap sample selected individuals randomly and used the observed and predicted GA values already calculated for those individuals. The pairs of R2, SE, and MAD values were calculated from the same bootstrap sample to account for the same dataset being used in each comparison. Thus, we did not need to refit the full prediction model for each bootstrap sample.

The bootstrapping was performed using the R-package boot [70, 71]. The 95% confidence intervals of the bootstrap differences were standard percentile intervals, reported as type “perc” by the boot package. A difference was considered statistically significant when the corresponding confidence interval did not include the value 0.
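The paired bootstrap described above can be sketched as follows. Individuals are resampled with replacement, both clocks are scored on the same resample, and the percentile interval of the differences is reported. This is a simplified illustration and not the boot package: the percentile bounds are taken directly from the sorted differences without the interpolation that boot applies, and the function names, the number of resamples, and the seed are assumptions.

```python
import random
from statistics import median


def mad(observed, predicted):
    """Median absolute difference between observed and predicted values."""
    return median(abs(x - y) for x, y in zip(observed, predicted))


def bootstrap_difference_ci(observed, pred_a, pred_b, metric,
                            n_boot=2000, level=0.95, seed=1):
    """Percentile CI for metric(clock A) - metric(clock B) on paired resamples."""
    rng = random.Random(seed)
    n = len(observed)
    diffs = []
    for _ in range(n_boot):
        # Resample individuals; reuse their already computed predictions,
        # so no prediction model has to be refitted.
        idx = [rng.randrange(n) for _ in range(n)]
        obs = [observed[i] for i in idx]
        a = [pred_a[i] for i in idx]
        b = [pred_b[i] for i in idx]
        diffs.append(metric(obs, a) - metric(obs, b))
    diffs.sort()
    lower = diffs[int((1 - level) / 2 * n_boot)]
    upper = diffs[int((1 + level) / 2 * n_boot) - 1]
    return lower, upper


def is_significant(ci):
    """A difference counts as significant when the interval excludes 0."""
    return not (ci[0] <= 0 <= ci[1])
```

Because both metrics in each pair come from the same resample, the interval reflects the correlation between the two clocks' errors on the same newborns, which an unpaired comparison would ignore.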

Gestational age acceleration analysis

GAA was defined as the residuals from a linear regression of DNAm gestational age predicted by the EPIC GA clock on ultrasound-estimated gestational age [16]. We tested for association between GAA and ART by performing a logistic regression of ART on GAA.
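The residual definition of GAA can be illustrated with a short sketch using ordinary least squares for a single predictor. The subsequent logistic regression of ART status on GAA is not shown. The function name and the toy values are hypothetical.

```python
from statistics import mean


def gestational_age_acceleration(clinical_ga, dnam_ga):
    """Residuals from regressing DNAm-predicted GA on clinical GA."""
    mx, my = mean(clinical_ga), mean(dnam_ga)
    sxx = sum((x - mx) ** 2 for x in clinical_ga)
    sxy = sum((x - mx) * (y - my) for x, y in zip(clinical_ga, dnam_ga))
    slope = sxy / sxx
    intercept = my - slope * mx
    # Positive residual: the newborn appears epigenetically "older" than
    # expected for its clinical gestational age; negative: "younger".
    return [y - (intercept + slope * x) for x, y in zip(clinical_ga, dnam_ga)]


# Invented values in days for five newborns.
ultrasound_ga = [266, 273, 280, 287, 294]
dnam_ga = [268, 271, 283, 286, 290]
gaa = gestational_age_acceleration(ultrasound_ga, dnam_ga)
```

By construction the residuals sum to zero and are uncorrelated with clinical gestational age, so GAA captures only the part of the DNAm-based estimate that clinical gestational age does not explain.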

Gene-enrichment analysis

The online functional enrichment software WebGestalt [34] was used to search for enrichment within the annotated genes of the EPIC GA clock. We identified 154 unique gene names annotated for the 176 CpGs selected in the EPIC GA clock using the annotation data from Illumina’s Infinium MethylationEPIC v1.0 B4 Manifest file. We then performed an overrepresentation analysis on the 154 genes using Fisher’s exact test [35], assigning a minimum of five genes per category, and using the genome as background. WebGestalt leverages data from the following databases for each category: gene ontology [72, 73] (Biological Process, Cellular Component, Molecular Function), pathway (KEGG [74], Panther [75], Reactome [76], WikiPathway [77]), network (Kinase target, Transcription Factor target, miRNA target), disease (DisGeNET [78], GLAD4U [79], OMIM [80]), drug (DrugBank [81]), phenotype (Human Phenotype Ontology [82]), and chromosomal location (Cytogenic Band). The Benjamini–Hochberg procedure was applied to the p-values, and categories with a false discovery rate below 0.01 were declared significantly enriched.
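The statistics behind an overrepresentation analysis of this kind can be sketched as follows: a one-sided Fisher's exact test (equivalently, the upper tail of the hypergeometric distribution) per category, an enrichment ratio of observed to expected overlap, and Benjamini–Hochberg adjustment across categories. This is an illustration of the general technique and not WebGestalt's code; the function names are hypothetical, and the background size and category counts in the example are invented.

```python
from math import comb


def overrepresentation_p(k, K, n, N):
    """P(overlap >= k) when n genes are drawn from a background of N genes,
    K of which belong to the category (one-sided Fisher's exact test)."""
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return tail / total


def enrichment_ratio(k, K, n, N):
    """Observed overlap divided by the overlap expected by chance."""
    return k / (n * K / N)


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Invented example: 154 clock genes, a background of 20,000 genes, and three
# categories given as (genes in category, overlap with the clock genes).
N, n = 20000, 154
categories = [(20, 4), (300, 5), (1000, 8)]
pvals = [overrepresentation_p(k, K, n, N) for K, k in categories]
fdr = benjamini_hochberg(pvals)
enriched = [f < 0.01 for f in fdr]
```

Small categories can reach large enrichment ratios with only a handful of overlapping genes, which is one reason for requiring a minimum number of genes per category, as done above.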