Analysis of two birth tissues provides new insights into the epigenetic landscape of neonates born preterm
Preterm birth (PTB), defined as child birth before completion of 37 weeks of gestation, is a major challenge in perinatal health care and can bear long-term medical and financial burden. Over a million children die each year due to PTB complications, and those who survive can face developmental delays. Unfortunately, our understanding of the molecular pathways associated with PTB remains limited. There is a growing body of evidence suggesting the role of DNA methylation (DNAm) in mediating the effects of PTB on future health outcomes. Thus, epigenome-wide association studies (EWAS), where DNAm sites are examined for associations with PTB, can help shed light on the biological mechanisms linking the two.
In an Asian cohort of 1019 infants (68 preterm, 951 full term), we examined and compared the associations between PTB and genome-wide DNAm profiles using both cord tissue (n = 1019) and cord blood (n = 332) samples on Infinium HumanMethylation450 arrays. PTB was significantly associated (P < 5.8e−7) with DNAm at 296 CpGs (209 genes) in cord blood. Over 95% of these CpGs were replicated in other PTB/gestational age EWAS conducted in (cord) blood. This replication was apparent even across populations of different ethnic origin (Asians, Caucasians, and African Americans). More than a third of these 296 CpGs were replicated in at least 4 independent studies, thereby identifying a robust set of PTB-linked epigenetic signatures in cord blood. Interrogation of cord tissue in addition to cord blood provided novel insights into the epigenetic status of neonates born preterm. Overall, 994 CpGs (608 genes, P < 3.7e−7) were associated with PTB in cord tissue, of which only 10 were also identified in the cord blood analysis. Genes from cord tissue showed enrichment of molecular pathways related to fetal growth and development, while those from cord blood showed enrichment of immune response pathways. A substantial number of PTB-associated CpGs from both birth tissues were also associated with gestational age.
Our findings provide insights into the epigenetic landscape of neonates born preterm and indicate that its status is captured more comprehensively by interrogating more than one neonatal tissue in tandem. Both neonatal tissues are clinically relevant in their own ways and require careful consideration when identifying biomarkers related to PTB and gestational age.
This birth cohort is a prospective observational study designed to study the developmental origins of health and disease, and was retrospectively registered on 1 July 2010 under the identifier NCT01174875.
Keywords: Epigenome-wide association study, Preterm birth, Gestational age, Tissue specificity, DNA methylation, Neonate
EWAS: Epigenome-wide association study
GUSTO: Growing Up in Singapore Towards Healthy Outcomes
KKH: KK Women’s and Children’s Hospital, Singapore
NUH: National University Hospital, Singapore
SVA: Surrogate variable analysis
Preterm birth (PTB), defined as delivery of the offspring before completion of 37 weeks of gestation, is a major public health problem that exerts a significant disease burden globally. In 2016, the World Health Organization estimated that 15 million babies (at least 1 in 10) are born preterm annually and that these numbers are rising each year. PTB is associated with developmental delays, and infants born preterm are at an increased risk of mortality from infancy to adulthood due to the onset of various chronic health problems [3, 4]. However, the biological pathways underlying the associations between PTB and future health remain elusive [5, 6]. Epigenetic mechanisms play a critical role in regulating cell lineage commitment and fetal programming and are highly sensitive to in utero perturbations. Any interference with the epigenetic settings within the cell or its developmental state can have a life-long impact on the health of the offspring. Thus, epigenome-wide association studies (EWAS) related to PTB [7, 8, 9, 10, 11, 12] can help elucidate the biological mechanisms linking PTB to later health.
There is a growing body of evidence suggesting the influence of PTB on the neonatal epigenome through DNA methylation (DNAm) [7, 8, 13, 14, 15, 16, 17, 18]. Earlier efforts in interrogating DNAm changes in association with PTB typically focused on candidate regions of the epigenome [19, 20] or were conducted in smaller sample sizes [7, 8, 9, 10, 18]. Recently, some research groups have conducted EWAS of gestational age (GA), with some using larger sample sizes. Schroeder et al. reported and replicated the association between DNAm and GA at CpG sites in 25 genes, genes previously implicated in labor and delivery and adverse health outcomes. Lee et al. reported DNAm at three regions associated with GA, regions located near genes that play key roles in fetal development (NFIX, RAPGEF2, MSRB3). Bohlin et al. and Simpkin et al. reported DNAm at 5474 CpG sites and 224 CpG sites to associate with GA, respectively. Though the total sample sizes in these GA EWAS were larger, with the exception of Bohlin et al., the number of preterm infants in the analyses did not exceed 30.
While earlier studies have made significant progress in identifying DNAm perturbations associated with PTB/GA and enhanced our understanding of the epigenetic processes associated with PTB, a few important considerations remain. First, as earlier investigations were primarily conducted in Caucasian and/or African American populations, it is unclear how these findings hold in an Asian population. Second, earlier work primarily focused on examination of DNAm in infant cord blood [7, 9, 10, 11, 12, 16, 21, 22], but there have been no studies done on cord tissue. Since cord tissue and cord blood originate from different cell lineages, each tissue potentially reveals unique perspectives within the preterm scenario. Pertinently, our earlier work has demonstrated that neonate EWAS conducted using infant cord tissue can give very distinct findings from those conducted in cord blood. Hence, the two tissues together provide a better understanding of the epigenetic alterations induced by a suboptimal fetal environment. Here, we present the first EWAS of PTB conducted in an Asian cohort, where we examine and compare the associations of PTB with DNAm in both infant cord tissue and cord blood.
This study involved 1019 infants from live singleton births, of which 68 infants were born preterm (Additional file 1: Figure S1A). Summary statistics of these infants are provided in Additional file 2: Table S1. The ethnic distribution of study subjects with available cord tissue samples was 58% Chinese, 25% Malay, and 17% Indian. Fifty-three percent of the infants were male. The difference in the distributions of ethnicity (P = 0.88) and sex (P = 0.90) of the infants in preterm vs. term groups was not statistically significant. We interrogated DNAm profiles derived from infant cord tissue and cord blood using the Infinium HumanMethylation450 array. DNAm data was available for all 1019 infants for cord tissue and in a subset of infants for cord blood (332 infants, including 31 preterm infants, Additional file 2: Table S2, Additional file 1: Figure S1B). Similarly, the distributions of infants with cord blood samples in preterm vs. term groups were not significantly different with respect to ethnicity (P = 0.47) and infant sex (P = 0.58). After quality control and elimination of CpGs with low variability, 134,676 and 85,624 CpGs were retained for subsequent analyses in cord tissue and cord blood, respectively.
Cord tissue reflected extensive associations between PTB and infant DNAm
Cord blood reflected extensive associations between PTB and infant DNAm
Majority of PTB-associated CpGs in cord blood are replicated in other PTB/GA EWAS
DNA methylomes of cord blood and cord tissue respond differently to PTB
For CpGs significantly associated with PTB in at least one tissue, we also assessed whether there was evidence of tissue-dependent effects. Of the 994 PTB-associated CpGs from cord tissue, 546 CpGs had been removed from the cord blood dataset during quality control filtering (426 of them because of low inter-individual variation); of the remaining 448 CpGs, the majority (143 at P < 1e−4, 310 at P < 0.05) showed evidence of tissue-dependent effects (Additional file 2: Table S7). Similarly, of the 296 PTB-associated CpGs from cord blood, 102 CpGs had been removed from the cord tissue dataset during quality control filtering (29 of them because of low inter-individual variation); of the remaining 194 CpGs, the majority (126 at P < 1e−4, 184 at P < 0.05) showed evidence of tissue-dependent effects (Additional file 2: Table S8).
DNAm status of genes affected by PTB in the two neonatal tissues represents distinct biological processes
Majority of PTB-associated CpGs in cord tissue and cord blood were also associated with GA
Lastly, we also examined the associations between GA and DNAm in each tissue. In this analysis, GA was modeled as a continuous variable instead of a binary variable (preterm vs. term). After adjustment for multiple testing, 4075 CpGs (P < 3.7e−7) were significantly associated with GA in cord tissue (Additional file 2: Table S11). In the cord blood analysis, 1916 CpGs (P < 5.8e−7) were associated with GA (Additional file 2: Table S12); 94 of these overlapped with the 4075 cord tissue GA-associated CpGs. Comparison of GA-associated vs. PTB-associated CpGs (Additional file 1: Figure S12) showed that > 95% of the 994 PTB-associated CpGs in cord tissue were also GA-associated (950 with P < 3.7e−7 and 993 with P < 1e−4 in an analysis using GA). Similarly, most of the 296 PTB-associated CpGs in cord blood remained GA-associated (284 with P < 5.8e−7 and 293 with P < 1e−4). These results suggest that PTB-associated CpGs may also be a signature of GA. Gene ontology analyses performed on the 4075 cord tissue GA-associated CpGs and 1916 cord blood GA-associated CpGs gave similar conclusions to the analyses performed on PTB-associated CpGs. Specifically, cord tissue CpGs showed enrichment of pathways (Additional file 1: Figure S13) related to fetal growth and development (Additional file 1: Figure S14, Additional file 2: Table S13), while cord blood CpGs showed enrichment of immune response pathways (Additional file 1: Figure S15, Additional file 2: Table S14). We also compared these 1916 cord blood GA-associated CpGs with those reported by previous studies [7, 8, 9, 10, 11, 12]. Of the 1916 cord blood GA-associated CpGs identified in the current study, 89% (1714 CpGs) could be replicated in at least 1 of the previous studies and 60% (1141 CpGs) in at least 2 of the previous studies (Additional file 1: Figure S16, Additional file 2: Table S15).
However, the replication of the 296 cord blood PTB-associated CpGs with previous studies was relatively higher as > 95% of these CpGs replicated in at least 1 of the previous studies and > 80% replicated in at least 2 other studies.
In this study, we report associations with DNAm profiles in neonates born preterm by using tissues of different germinal origins, i.e., cord tissue and cord blood. The key findings from our study include (1) the replication of PTB/GA-associated cord blood CpGs across different studies and ethnicities to identify robust epigenetic signatures of PTB, (2) the identification of DNAm associations with PTB in cord tissue, and (3) the importance of evaluating the DNA methylomes of two germinally distinct neonatal tissues to capture a more comprehensive view of the molecular pathways associated with PTB.
Replication of CpGs associated with GA/PTB in cord blood across different studies and ethnicities
More than 95% of the CpGs identified in our cord blood PTB EWAS were replicated in previous PTB/GA EWAS studies [7, 8, 9, 10, 11, 12]. In particular, cg23062810 from the CLIP2 gene was replicated across six independent studies. The CLIP2 gene also seems to be a hotspot for PTB/GA-associated DNAm changes, as 6 additional CpGs have been previously reported from this gene: cg16356456 [7, 8, 9, 10, 11, 12], cg04952324 [8, 9, 10, 11], cg11573518, cg02935052, cg21375204, and cg19501108. Notably, 2 CpGs adjoining cg23062810, i.e., cg16356456 and cg11573518, also showed moderate significance (P < 1e−5) in our study. The CpG trio of cg23062810, cg16356456, and cg11573518 is a promising candidate epigenetic signature for functional studies, as they are not only consistently reported to be hypermethylated in cord blood of preterm neonates, but also span a short 224-bp genomic region containing a DNaseI hypersensitive site and several known transcription factor binding sites. CLIP2 is a cytoplasmic linker protein expressed in the brain, with its haploinsufficiency linked to motor coordination abnormalities. CLIP2 deletion is linked to Williams-Beuren syndrome, but deletion of a single copy alone is insufficient to result in the physical or cognitive characteristics of the disease.
Furthermore, although our study interrogated PTB associations in an Asian population, we achieved robust replication of 16 CpGs across all 6 earlier PTB/GA EWAS studies conducted in other populations of Caucasian/African American origin. These 16 CpGs span 12 genes, with 4 of these genes containing at least 2 PTB-associated CpGs in the current study. These genes include interleukin 21 receptor (IL21R), a key component of the adaptive immune system; NCOR2, a relatively ubiquitously expressed repressor linked to a wide variety of biological processes including metabolism, inflammation, and circadian rhythm; proline-rich 5 like (PRR5L), involved in the cellular response to oxidative stress; and insulin-like growth factor 2 mRNA-binding protein 1 (IGF2BP1), a tightly regulated cell proliferation protein highly expressed during embryogenesis [32, 33]. Notably, the PRR5L gene carries 10 previously reported GA/PTB-associated CpGs, 3 of which we found to be PTB-associated in the current study (cg08943494, cg00220721, cg22117805). Although the exact function of PRR5L with respect to pregnancy is unknown, PRR5L suppresses a key regulator of cellular mTORC2 in vitro, which in turn is regulated by lysophosphatidic acid (LPA) and Gα12 activity. LPA is implicated in the maintenance of pregnancy, uterine contractility, and infection-related preterm labor, while Gα12 is a molecular regulator of extracellular stimuli, including oxidative stress. There is also emerging evidence that mTOR-related genes are differentially expressed between term and preterm labor as well as between labor and non-labor myometrium.
Identification of associations between DNAm and PTB in cord tissue
In addition to the findings from cord blood, we identified 994 CpGs to significantly associate with PTB in cord tissue, of which only 10 CpGs overlapped with cord blood CpGs. Our cord tissue findings provide new insights into the epigenetic landscape of neonates born preterm as this birth tissue has not been explored in this context before. Most importantly, the analysis of two neonatal tissues representing different cell type lineages provides a wider coverage of biological processes associated with PTB.
Combination of EWAS in two neonatal tissues captures a comprehensive view of the molecular pathways associated with PTB
The two neonatal tissues gave deeper insights into the plausible molecular pathways associated with PTB. Gene networks in cord blood indicated the role of inflammation in PTB, which is in agreement with previous findings implicating inflammation in the etiology of PTB [6, 40]. The most statistically significant PTB-CpGs from cord blood were found in genes involved in inflammation such as TRAF5, a key regulator of both canonical (via TNFα) and non-canonical (via lymphotoxins) NF-kappaB activation, and MYLK, a relatively ubiquitously expressed gene implicated in several inflammatory diseases and also the main target for oxytocin-induced phosphorylation, downregulation of which follows uterine contraction at term. Immune-related genes that were highly reproduced across different studies include NCOR2, an integral corepressor within Notch signaling with links to NF-kappaB-mediated apoptosis; zinc finger and BTB domain containing 7B (ZBTB7B), a key regulator of CD4+ T cell commitment; PDZ and LIM domain protein 2 (PDLIM2), a key inhibitor of inflammatory response through NF-kappaB; and IL2RA, a key component in immunological function primarily through the establishment of T cell immunological memory.
Gene ontology terms linked with cord blood CpGs also reflected the dominance of immune-related biological processes despite the adjustment for cellular heterogeneity. The largest gene ontology cluster enriched in the pathway analysis was regulation of T cell differentiation, a hallmark of adaptive immune system development, which includes genes such as tripartite motif containing 22 (TRIM22, interferon signaling), interleukin 1 receptor-associated kinase 2 (IRAK2, inflammatory response to infection), and caspase recruitment domain family member 11 (CARD11, critical component of T cell and B cell signaling). The next two largest clusters also featured several gene ontology terms with various immune-related nuclear factor kappa-light-chain-enhancer of activated B cells (NF-kappaB) components.
The role of immune-related genes is also apparent, albeit to a smaller degree, in cord tissue CpGs significantly associated with PTB. Prominent examples include NFKBIA, which binds to the nuclear localization signal of the inflammatory response element NF-kappa-B/REL complex, preventing transcription and inflammatory response; and NFIL3 (nuclear factor, interleukin 3 activated), a transcription regulator that represses many genes but is also known to activate interleukin-3, mediating pro-B lymphocyte survival. Incidentally, NFKBIA appears to be upregulated in placentas with a history of chorioamnionitis, as well as in those complicated by preterm premature rupture of membranes (PPROM). Upregulation of NFKBIA is suggested to be a form of anti-inflammatory response to inflammatory insults. NFIL3, on the other hand, is downregulated with term within CD34+ cord blood fractions, consistent with the comparatively larger population of immature hematopoietic progenitors in preterm cord blood. Collectively, differentially methylated CpGs from both tissues highlight the role of inflammatory genes in PTB, with a larger representation in cord blood than cord tissue.
Cord tissue CpGs significantly associated with PTB were found in genes with more diverse functions, as opposed to the primarily immune responses seen in cord blood. These included general transcription factor genes such as protein C-Ets-2 (ETS2) and specificity protein 1 (SP1). Incidentally, ETS2 was previously reported to be downregulated in preterm placentas with spontaneous labor, while genes differentially expressed in the peripheral blood of mothers who delivered preterm showed over-representation of SP1 binding sites within their promoters. Gene ontology enrichment analysis revealed cord tissue CpGs to mostly lie in genes related to physiological growth and development. In particular, bone development was the largest cord tissue gene ontology cluster, including genes such as parathyroid hormone 1 receptor (PTH1R, surface receptor of osteoblasts), bone morphogenetic proteins 2 and 6 (BMP2, BMP6, stimulators of bone growth), and matrix metallopeptidase 7 (MMP7, associated with bone remodeling). This was followed by regulation of the Wnt signaling pathway, which plays a central role in embryonic development, with members such as Wnt family members 8A and 11 (WNT8A, involved in axis patterning; WNT11, involved in skeletal, kidney, and lung development). The third largest gene cluster was related to extracellular matrix (ECM) organization. ECM impacts a number of cellular functions critical for normal fetal development and morphogenesis. The most familiar developmental function attributed to ECM is cell migration during fetal development and organogenesis, which is facilitated by cycles of cell adhesion and deadhesion. ECM also plays important structural roles in defining tissue boundaries, branching morphogenesis, developing tissue asymmetry, and growth factor signaling.
This study has a few limitations. First, while we have observed global DNAm alterations in the two neonatal tissues at CpG sites assayed using the Infinium HumanMethylation450 platform, the CpGs assayed by the platform were not randomly selected from the DNA methylome. Consequently, it is unclear how the findings will extend to the rest of the DNA methylome. Second, as supported by our findings here and in a previous publication, EWAS conducted using different tissues can give very distinct findings. While use of clinically available tissues like cord tissue and cord blood is convenient, these two tissues may not completely mirror the effects of PTB in target tissues. Thus, further research is necessary to investigate whether these findings can be extrapolated to the relevant tissues of interest. Third, while we have successfully replicated our findings in cord blood using results from published literature, the lack of available cord tissue DNAm data means that we are unable to replicate the cord tissue findings in an independent cohort. However, the robust (> 95% CpGs) replication of our cord blood findings in previously reported PTB studies and the use of a larger sample size suggest that our novel cord tissue findings are likely to be robust too.
Using DNAm profiles from two different neonatal tissues (cord tissue and cord blood), we provide the epigenetic status of a broader spectrum of molecular pathways associated with PTB. Our findings suggest that genes involved in inflammation and fetal developmental processes play a key role in PTB. Further research is necessary to identify the specific role played by these epigenetic changes on the postnatal developmental and health trajectories of the offspring.
Between June 2009 and September 2010, healthy pregnant women were recruited in their first trimester of pregnancy from two major public hospitals in Singapore, namely the KK Women’s and Children’s Hospital (KKH) and the National University Hospital (NUH), to participate in the Growing Up in Singapore Towards Healthy Outcomes (GUSTO) birth cohort study. To participate in the study, pregnant women had to satisfy the following inclusion criteria: (1) be at least 18 years of age; (2) hold Singapore citizenship or permanent residency, or intend to reside in Singapore for the next 5 years; (3) be of Chinese, Malay, or Indian ethnic origin, confirmed through homogeneous parental ethnic background and genotyping; (4) intend to deliver at either NUH or KKH; and (5) intend to donate cord tissue and cord blood. The exclusion criteria included (1) women on chemotherapy, (2) women with significant health conditions such as type 1 diabetes mellitus and psychosis, and (3) women on specific medications such as psychotropic drugs. The present analysis was restricted to live singleton births with infant DNAm data (cord tissue or cord blood).
Determining GA, infant sex, and ethnicity
GA was determined by ultrasonography in the first trimester of pregnancy. PTB was defined as GA < 37 weeks. Child sex was extracted from the medical records. Ethnicity was self-reported by the mother at study recruitment.
Tissue collection and processing
Detailed information on cord tissue and cord blood collection as well as processing has been previously described. Briefly, cord blood was collected post-delivery by either dripping the blood into EDTA tubes for normal deliveries or collecting it via a syringe in the event of assisted deliveries. Collected cord blood was centrifuged at 4 °C, 3000 g for 5 min, and the extracted buffy coat was stored at − 80 °C until subsequent DNA extraction. DNA extraction of cord blood was carried out using the QIAsymphony DNA Kit as per the manufacturer’s instructions. After collection of the cord blood, cord tissue was cleaned with phosphate buffer saline (PBS) solution. The cord was then snap-frozen in liquid nitrogen and stored at − 80 °C until subsequent DNA extraction. Before DNA extraction, frozen umbilical cords were crushed using a mortar and pestle, treated with 10 U/mL hyaluronidase enzyme, and homogenized using a Xiril Dispomix Homogeniser. Proteinase K was added to the homogenate and incubated overnight at 55 °C. Cord tissue DNA was then extracted as described earlier.
DNAm profiling and data processing
DNA methylomes for cord tissue and cord blood were profiled and processed separately using the Infinium HumanMethylation450 platform (Additional files 3 and 4). Data processing was conducted using an in-house quality control procedure that was previously described. Briefly, we exported raw DNAm beta values from GenomeStudio™ and set probes with less than three beads for either the methylated or unmethylated channel or with detection P value > 0.01 to missing. We then performed color adjustment and normalization of the type 1 and 2 probes and excluded sex chromosome probes. As part of the study design for DNAm profiling, samples were randomized across chip and position on chip with respect to key variables including GA, infant sex, and ethnicity. Thus, as expected, PTB did not associate with chip or position effects. For both tissues, a principal component analysis of the raw DNAm revealed chip to associate most significantly with the raw DNAm data. DNAm data for both tissues were thus adjusted for chip using COMBAT, removing CpGs with missing values across all 12 positions on any chip. Among the remaining technical variables that were associated with the top principal components of the DNAm data but were not randomized, PTB was associated with bisulfite conversion batch (both cord tissue and cord blood) and DNA extraction batch (cord tissue only), and these variables were included as covariates in all regression models. Finally, cross-hybridizing probes [73, 74], CpGs on or within a single-base extension of a SNP, and CpGs with multi-modal distributions were excluded from the analysis. As CpGs with low inter-individual variation in each tissue may be more reflective of technical variation than true biological signal, to reduce false positives and increase overall study power [75, 76], we further excluded CpGs that had low inter-individual variation in each tissue (i.e., DNAm range under 10% or DNAm of the 99th centile minus 1st centile under 5%).
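The low-variability exclusion rule described above can be sketched in Python as follows. This is only an illustration and not the in-house pipeline: the function name, the use of NumPy, the CpG-by-sample array layout, and the reading of "10%" and "5%" as 0.10 and 0.05 on a 0 to 1 beta scale are all assumptions made here.

```python
import numpy as np

def low_variability_mask(beta, range_cutoff=0.10, centile_cutoff=0.05):
    """Flag CpGs (rows) with low inter-individual variation.

    beta: array of DNAm beta values, shape (n_cpgs, n_samples), assumed 0-1 scale.
    A CpG is flagged if its full range is under `range_cutoff` OR its
    99th-minus-1st centile spread is under `centile_cutoff`.
    """
    beta = np.asarray(beta, dtype=float)
    # Full range per CpG, ignoring values set to missing (NaN) during QC
    full_range = np.nanmax(beta, axis=1) - np.nanmin(beta, axis=1)
    # 1st and 99th centiles per CpG
    p1, p99 = np.nanpercentile(beta, [1, 99], axis=1)
    return (full_range < range_cutoff) | ((p99 - p1) < centile_cutoff)

# Hypothetical toy data: the first CpG barely varies, the second varies widely
beta = np.array([
    [0.50, 0.51, 0.52, 0.53],
    [0.10, 0.40, 0.70, 0.90],
])
mask = low_variability_mask(beta)
kept = beta[~mask]  # CpGs retained for downstream analysis
```

In this toy input only the second CpG is retained, since the first has a range of 0.03.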
After quality control and exclusion of CpGs with low variability, 134,676 CpGs (cord tissue) and 85,624 CpGs (cord blood) were available for subsequent analysis. For infant cord tissue, cellular proportions for stromal, endothelial, epithelial, and blood were estimated using a reference panel  and their principal components were adjusted as covariates in all regression models. Likewise, for infant cord blood, cell-type proportions for granulocytes, monocytes, natural killer cells, B cells, CD4+ T cells and CD8+ T cells were estimated using a reference panel  and their principal components were adjusted as covariates in all regression models. CpGs were annotated with respect to gene features (promoter, 5′-UTR, exon, intron, 3′-UTR, TTS, and intergenic regions) using Homer annotatePeaks function (hg19).
Association between DNAm and PTB
To examine the association between DNAm and PTB for each tissue, we fitted a linear regression model with DNAm as the dependent variable and PTB as the independent variable, adjusted for technical variables that associated with PTB (bisulfite conversion batch and DNA extraction batch), infant sex, ethnicity, and estimated cell-type proportions. Infant sex and ethnicity were selected as covariates for inclusion in the regression models based on a priori evidence of their playing key roles in DNAm and/or PTB. For each CpG, individuals with outlier DNAm values (defined as DNAm values exceeding the cohort median ± twice the interquartile range for each CpG) were excluded from the analysis. PTB was coded as a binary variable, with 1 = term and 0 = preterm; thus, a negative regression coefficient implies that DNAm levels were generally higher among the preterm infants compared to term infants.
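A minimal per-CpG sketch of this model is given below, assuming ordinary least squares via NumPy. The function name and argument layout are hypothetical, and the covariate matrix stands in for the batch, sex, ethnicity, and cell-type terms listed above; it is not the authors' code.

```python
import numpy as np

def fit_cpg(dnam, ptb, covariates):
    """OLS of DNAm on PTB (1 = term, 0 = preterm) plus covariates for one CpG.

    dnam: 1-D array of DNAm values for one CpG.
    ptb: 1-D array coded as in the text (1 = term, 0 = preterm).
    covariates: 2-D array, one row per infant (hypothetical stand-in for
        batch, sex, ethnicity, and cell-type terms).
    Returns (PTB coefficient, number of infants kept after outlier removal).
    """
    dnam = np.asarray(dnam, dtype=float)
    ptb = np.asarray(ptb, dtype=float)
    covariates = np.asarray(covariates, dtype=float)

    # Outlier rule from the text: drop values beyond median +/- 2 * IQR
    median = np.median(dnam)
    q1, q3 = np.percentile(dnam, [25, 75])
    keep = np.abs(dnam - median) <= 2 * (q3 - q1)

    # Design matrix: intercept, PTB indicator, then covariates
    X = np.column_stack([np.ones(keep.sum()), ptb[keep], covariates[keep]])
    coef, *_ = np.linalg.lstsq(X, dnam[keep], rcond=None)
    return coef[1], int(keep.sum())
```

With the coding used here, a negative returned coefficient corresponds to higher DNAm in preterm infants, matching the interpretation given in the text.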
For CpGs significantly associated with PTB in at least one tissue, we also assessed whether there was evidence of tissue-dependent effects. This analysis was performed by fitting a general linear model with an unstructured covariance structure to a combined dataset with DNAm data from both tissues, including main effect terms for tissue and PTB, and an interaction term between PTB and tissue and other covariates. The interaction term between PTB and tissue provides an estimate of the difference in PTB-DNAm association in the two tissues, and a statistical test of this interaction term provides a formal test of tissue-dependent effects.
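One way to write the tissue-interaction model described above is sketched below. The symbols are notation introduced here for illustration only: y is the DNAm value for infant i in tissue j, z collects the other covariates, and the normality of the errors is an assumption not stated in the text.

```latex
\begin{aligned}
y_{ij} &= \beta_0 + \beta_1\,\mathrm{PTB}_i + \beta_2\,\mathrm{Tissue}_j
        + \beta_3\,(\mathrm{PTB}_i \times \mathrm{Tissue}_j)
        + \boldsymbol{\gamma}^{\top}\mathbf{z}_{ij} + \varepsilon_{ij}, \\
(\varepsilon_{i,\mathrm{tissue}},\ \varepsilon_{i,\mathrm{blood}})^{\top}
       &\sim \mathcal{N}(\mathbf{0}, \Sigma), \qquad \Sigma \ \text{unstructured}.
\end{aligned}
```

Under this notation, the test of tissue-dependent effects corresponds to testing the null hypothesis that the interaction coefficient \beta_3 equals zero.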
For genes where the CpGs were significantly associated with PTB after adjustment for multiple testing using a Bonferroni correction, we further examined them for enrichment of gene ontology biological pathways using the gometh function in the missMethyl R package, which maps CpG sites to their nearest gene and corrects for bias due to non-uniform coverage of genes on the Infinium HumanMethylation450 array. To consolidate and summarize the pathway enrichment analysis results from gometh, nominally significant GO terms (P < 0.01) within the “biological processes” category were further run through the REVIGO tool, which avoids reporting GO terms with a semantic similarity measure greater than 70%. As GO terms involving many genes may not inform precise gene functionalities, larger GO terms (containing 300 or more genes) were removed before running REVIGO. The results from REVIGO were visualized using TreeMaps.
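As a worked check of the Bonferroni correction mentioned above, the short sketch below divides a family-wise significance level by the number of CpGs tested in each tissue. The level of 0.05 is an assumption (it is not stated in the text), although it does reproduce the reported thresholds of P < 3.7e−7 and P < 5.8e−7 once truncated to two significant figures.

```python
def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance threshold; alpha = 0.05 is an assumed level."""
    return alpha / n_tests

# Numbers of CpGs retained after quality control, as reported in the text
cord_tissue = bonferroni_threshold(134_676)
cord_blood = bonferroni_threshold(85_624)

print(f"cord tissue: {cord_tissue:.2e}")  # cord tissue: 3.71e-07
print(f"cord blood:  {cord_blood:.2e}")   # cord blood:  5.84e-07
```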
We also conducted sensitivity analyses where we further adjusted for mode of delivery, maternal hypertension, maternal age, smoking, parity, and position on chip (sensitivity analysis 1). To further allow for the possibility of unmeasured technical artifacts or un-accounted cell-type proportions, we also used surrogate variable analysis (SVA) to directly estimate sources of batch effects and/or cell-type composition from the DNAm data. The resulting estimated surrogate variables from the SVA could potentially capture both batch effects and cell-type composition. We conducted additional sensitivity analyses (sensitivity analysis 2), where we repeated the association analyses between PTB and DNAm, adjusting for surrogate variables from the SVA, on top of infant sex and ethnicity [80, 81].
Comparison of PTB-associated CpGs in cord blood with previously published studies
We compared our cord blood PTB EWAS findings with PTB/GA EWAS findings from previous studies [7, 8, 9, 10, 11, 12]. For a fair comparison, we restricted this analysis to the studies conducted using the same Infinium HumanMethylation450 platform. We also applied the DNAm GA clocks published by Knight et al.  and Bohlin et al.  to predict GA in our study samples. The clocks published by Knight et al. and Bohlin et al. were applied to our cord tissue and cord blood DNAm data separately. For this analysis, raw DNAm data without any processing or quality control filtering was used.
Associations between DNAm and GA
Since a number of previous EWAS were conducted using GA as a continuous variable instead of PTB as a binary variable, we also conducted an additional analysis using GA as a continuous variable. For each tissue, we fitted a linear regression model with DNAm as the dependent variable and GA as the independent variable, adjusted for the same covariates as before. Pathway analysis and comparison with earlier reports were performed similarly.
The GUSTO study group includes Pratibha Agarwal, Arijit Biswas, Choon Looi Bong, Birit F.P. Broekman, Shirong Cai, Jerry Kok Yen Chan, Yiong Huak Chan, Cornelia Yin Ing Chee, Helen Chen, Yin Bun Cheung, Amutha Chinnadurai, Chai Kiat Chng, Mary Foong-Fong Chong, Yap-Seng Chong, Shang Chee Chong, Mei Chien Chua, Doris Fok, Marielle V. Fortier, Peter D. Gluckman, Keith M. Godfrey, Anne Eng Neo Goh, Yam Thiam Daniel Goh, Joshua J. Gooley, Wee Meng Han, Mark Hanson, Christiani Jeyakumar Henry, Joanna D. Holbrook, Chin-Ying Hsu, Neerja Karnani, Jeevesh Kapur, Kenneth Kwek, Ivy Yee-Man Lau, Bee Wah Lee, Yung Seng Lee, Ngee Lek, Sok Bee Lim, Iliana Magiati, Lourdes Mary Daniel, Michael Meaney, Cheryl Ngo, Krishnamoorthy Niduvaje, Wei Wei Pang, Anqi Qiu, Boon Long Quah, Victor Samuel Rajadurai, Mary Rauff, Salome A. Rebello, Jenny L. Richmond, Anne Rifkin-Graboi, Seang-Mei Saw, Lynette Pei-Chi Shek, Allan Sheppard, Borys Shuter, Leher Singh, Shu-E Soh, Walter Stunkel, Lin Lin Su, Kok Hian Tan, Oon Hoe Teoh, Mya Thway Tint, Hugo P S van Bever, Rob M. van Dam, Inez Bik Yun Wong, P. C. Wong, Fabian Yap, and George Seow Heong Yeo.
This work was supported by the Translational Clinical Research (TCR) Flagship Program on Developmental Pathways to Metabolic Disease funded by the National Research Foundation (NRF) and administered by the National Medical Research Council (NMRC), Singapore (NMRC/TCR/004-NUS/2008). Additional funding is provided by the Strategic Positioning Fund (SPF) awarded by the Agency for Science, Technology and Research (A*STAR), Singapore, available to NK. XL is supported by the Duke-NUS block fund (R-913-200-127-263) and the Ministry of Education, Singapore Academic Research grant Tier 2 (MOE2018-T2-1-046).
Availability of data and materials
DNAm datasets used in this study have been included as supplementary files. Data related to preterm births are not publicly available due to ethical restrictions but can be obtained from the authors upon reasonable request and subject to appropriate approvals from the GUSTO cohort’s Executive Committee.
YW, XL, IYL, LC, and AT performed the data analysis. YW, XL, IYL, and NK interpreted the results and wrote the manuscript. YSC, PDG, and KHT were responsible for the conception and recruitment of the GUSTO cohort. JLM and MSK generated the Infinium 450K methylation data. NK supervised the study. All the authors critically revised the manuscript for intellectual and scientific content and approved the final manuscript.
Ethics approval and consent to participate
Written informed consent was obtained from all women who participated in the study. Approval for the study was granted by the ethics boards of both KK Women’s and Children’s Hospital (KKH) and National University Hospital (NUH), namely the Centralised Institutional Review Board (CIRB) and the Domain Specific Review Board (DSRB), respectively.
Consent for publication
YSC, PDG, and NK have received reimbursement for speaking at conferences sponsored by companies selling nutritional products. They are part of an academic consortium that has received research funding from Abbott Nutrition, Nestec and Danone. The other authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
- 1. March of Dimes, PMNCH, Save the Children, WHO. Born too soon: the global action report on preterm birth. Geneva: World Health Organization; 2012.
- 8. Cruickshank MN, Oshlack A, Theda C, Davis PG, Martino D, Sheehan P, Dai Y, Saffery R, Doyle LW, Craig JM. Analysis of epigenetic changes in survivors of preterm birth reveals the effect of gestational age and evidence for a long term legacy. Genome Med. 2013;5(10):96.
- 15. Burris HH, Braun JM, Byun HM, Tarantini L, Mercado A, Wright RJ, Schnaas L, Baccarelli AA, Wright RO, Tellez-Rojo MM. Association between birth weight and DNA methylation of IGF2, glucocorticoid receptor and repetitive elements LINE-1 and Alu. Epigenomics. 2013;5(3):271–81.
- 18. Sparrow S, Manning JR, Cartier J, Anblagan D, Bastin ME, Piyasena C, Pataky R, Moore EJ, Semple SI, Wilkinson AG, et al. Epigenomic profiling of preterm infants reveals DNA methylation differences at sites associated with neural function. Transl Psychiatry. 2016;6:e716.
- 26. Hoogenraad CC, Eussen BH, Langeveld A, van Haperen R, Winterberg S, Wouters CH, Grosveld F, De Zeeuw CI, Galjart N. The murine CYLN2 gene: genomic organization, chromosome localization, and comparison to the human gene that is located within the 7q11.23 Williams syndrome critical region. Genomics. 1998;53(3):348–58.
- 27. van Hagen JM, van der Geest JN, van der Giessen RS, Lagers-van Haselen GC, Eussen HJ, Gille JJ, Govaerts LC, Wouters CH, de Coo IF, Hoogenraad CC, et al. Contribution of CYLN2 and GTF2IRD1 to neurological and cognitive symptoms in Williams syndrome. Neurobiol Dis. 2007;26(1):112–24.
- 39. Foster HA, Davies J, Pink RC, Turkcigdem S, Goumenou A, Carter DR, Saunders NJ, Thomas P, Karteris E. The human myometrium differentially expresses mTOR signalling components before and during pregnancy: evidence for regulation by progesterone. J Steroid Biochem Mol Biol. 2014;139:166–72.
- 54. Keniry M, Dearth RK, Persans M, Parsons R. New frontiers for the NFIL3 bZIP transcription factor in cancer, metabolism and beyond. Discoveries (Craiova). 2014;2(2):e15.
- 60. Zhao C, Meng A. Sp1-like transcription factors are regulators of embryonic development in vertebrates. Develop Growth Differ. 2005;47(4):201–11.
- 63. Mannstadt M, Juppner H, Gardella TJ. Receptors for PTH and PTHrP: their biological importance and functional properties. Am J Phys. 1999;277(5 Pt 2):F665–75.
- 64. Urist MR. Bone: formation by autoinduction. Science. 1965;150(3698):893–9.
- 77. Lin X, Tan JYL, Teh AL, Lim IY, Liew SJ, MacIsaac JL, Chong YS, Gluckman PD, Kobor MS, Cheong CY, et al. Cell type-specific DNA methylation in neonatal cord tissue and cord blood: a 850K-reference panel and comparison of cell types. Epigenetics. 2018, in press. https://doi.org/10.1080/15592294.2018.1522929.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.