Background

Cigarette smoking has severe adverse health consequences in adults and in the offspring of mothers who smoke during pregnancy [1]. Despite the efforts of advisory boards to inform women of the risk to the developing fetus, around 10% of mothers continue to smoke during pregnancy [https://www.cdc.gov/prams/]. Previous research has identified associations between maternal smoking during pregnancy and a range of health problems in the offspring [2, 3]. One of the most widely reported effects is low birth weight [4,5,6]. Low birth weight has in turn been associated with various long-term health problems in adulthood. These include increased vulnerability to stress [7], cognitive deficits [8], and chronic somatic disorders, such as cardiovascular disease [9, 10], and obesity [11]. Low birth weight has also been associated with psychiatric disorders, such as schizophrenia and depression [12, 13]. On the basis of such research, previous authors have proposed that chronic disease in adulthood may be initiated during pregnancy as a result of exposure to adverse intrauterine conditions. According to the theory of “fetal programming”, alterations in fetal nutrition and endocrine status result in developmental adaptations, which cause permanent changes in cellular structure, physiology, and metabolism, thereby predisposing to disease in adult life [14, 15].

A plausible mechanism through which exposure to adverse intrauterine conditions may negatively impact long-term health is DNA methylation. Candidate CpG investigations and epigenome-wide association studies (EWAS) have found consistent associations between smoking and differential DNA methylation in adults [16,17,18]. A number of studies have also investigated the impact of maternal smoking during pregnancy on methylation patterns in the offspring. These investigations have comprised candidate gene and genome-wide approaches [19,20,21,22,23], as well as EWAS [24,25,26,27,28,29,30,31]. In these EWAS, a number of differentially methylated genes have shown repeated and consistent association [32,33,34]. The top five genes are AHRR, GFI1, MYO1G, CYP1A1, and CNTNAP2.

Previous authors have formulated the hypothesis that maternal smoking impacts birth weight via smoking-induced DNA methylation. According to this theory, differential methylation functions as a mediator, i.e., a variable that accounts for part of the relation between the predictor and the criterion [35, 36]. Initial analyses to determine whether DNA methylation mediates between maternal smoking and birth weight have already been performed. In these investigations, a mediation effect was found for eight CpGs in an analysis of infant cord blood [37], and two CpGs in an analysis of the placenta [38].

The aims of the present study were to replicate the previously reported mediating effects of methylation on the association between maternal smoking and birth weight, and to investigate for the first time whether the observed mediation effects are sex-specific. The analyses focused on high-confidence smoking-related sites, and were restricted to CpG sites with: (i) significant association with smoking in a large recent meta-analysis [31]; or (ii) a reported mediating effect in the association between maternal smoking and birth weight [37, 38].

Analyses were also performed to determine whether the observed mediation effects were sex-specific, since: (i) methylation levels show substantial sex differences [39]; (ii) the observed sex differences are present from birth [40]; and (iii) differing methylation patterns might have different outcomes.

Methods

Sample description

Data for the present study were derived from the POSEIDON study ("Pre-, peri- and postnatal Stress in human and non-human offspring: A translational approach to study Epigenetic Impact on DepressiON"). A detailed description of the POSEIDON study is provided elsewhere [41]. The study protocol was approved by the Ethics Committee of the Medical Faculty Mannheim of the University of Heidelberg, and the study was conducted in accordance with the Declaration of Helsinki. Data from 410 mothers and 405 infants recruited during the third trimester of pregnancy from hospitals in the Rhine-Neckar Region of Germany were analyzed. Briefly, maternal exclusion criteria for the present analyses were: (i) a history of hepatitis B, hepatitis C, or HIV infection; (ii) any current or previous diagnosis of schizophrenia or any substance dependency other than nicotine; and (iii) any current psychiatric disorder requiring inpatient treatment. Exclusion criteria in the offspring were: birth weight <1,500 g; gestational age at delivery <32 weeks; multiple birth; and the presence of a congenital disease, malformation, deformation, or chromosomal abnormality.

A structured interview and a questionnaire battery were used to collect information concerning environmental, sociodemographic, medical, and psychosocial risk factors for stress. A detailed description of these instruments is provided elsewhere [42]. Sociodemographic data are presented in the supplement (Additional file 1: Table S1). For the purposes of the present analyses, the women were divided into two subgroups: smokers and non-smokers. The smoking group comprised women who had smoked throughout pregnancy. The non-smoking group comprised women who had either not smoked at all during the pregnancy, or who had smoked during early pregnancy only. This approach was taken because both previous studies and our own data revealed no differences in methylation patterns between infants whose mothers smoked during early pregnancy only and infants whose mothers did not smoke at all [43].

Blood collection, DNA extraction, genome-wide methylation assay

Whole cord blood was collected immediately after birth from n=313 newborn singletons. For n=299 newborns, automated genomic DNA extraction was performed using the chemagic Magnetic Separation Module I (Chemagen Biopolymer-Technologie AG; Baesweiler; Germany). For n=14 newborns, a low volume of umbilical cord blood (<2 mL) was obtained. For these 14 samples, DNA was isolated using the QIAamp DNA Blood Midi Kit (Qiagen GmbH; Hilden; Germany). All genomic DNA samples were stored at -20 °C prior to analysis.

DNA methylation was measured using the Illumina Infinium HumanMethylation450K Beadchip.

Statistical analysis

Applied software

Quality control (QC) and all statistical analyses were performed using R version 3.2.5 [44] and the R packages minfi [45], wateRmelon [46], ENmix [47], sva [48], limma [49], and mediation [50].

Data preprocessing, QC, and filtering

Methylation intensity signals were extracted from raw intensity data using the preprocessENmix procedure [47]. This includes background noise correction using out-of-band Infinium I intensities as well as correction for dye bias based on internal control probes [51]. Data were then quantile normalized, followed by Beta Mixture Quantile Dilation (BMIQ) [52]. Samples were excluded if DNA quality was insufficient, defined as any of the following: average of the medians of methylated and unmethylated signals <10.5; outlier status with regard to either averaged total intensity values or beta value distribution; insufficient bisulfite conversion; or detection failure (detection P-value > 0.01) at more than 1% of positions. Probes were excluded if any of the following criteria were met: bead count <3; detection failure in more than 1% of samples; location within 10 bp of a single nucleotide polymorphism (SNP); cross-reactivity; or X- or Y-linked status.
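The detection-based part of the sample and probe filter can be illustrated with a minimal Python sketch. This is not the ENmix implementation used in the study. The function names, the nested-dictionary layout, and the choice to screen samples before probes are assumptions made for illustration; only the two thresholds (detection P-value > 0.01, more than 1% failures) come from the text.

```python
def failed_fraction(pvalues, p_threshold=0.01):
    """Fraction of detection p-values that exceed the threshold."""
    pvalues = list(pvalues)
    return sum(p > p_threshold for p in pvalues) / len(pvalues)


def filter_by_detection(detp, p_threshold=0.01, max_fail=0.01):
    """Screen samples, then probes, on detection failure rate.

    detp maps sample ID -> {probe ID -> detection p-value}; every sample
    is assumed to carry the same probes (hypothetical layout).
    Returns (kept_samples, kept_probes).
    """
    # Keep samples whose share of failed positions is at most max_fail.
    kept_samples = [
        sample for sample, probes in detp.items()
        if failed_fraction(probes.values(), p_threshold) <= max_fail
    ]
    if not kept_samples:
        return [], []
    # Keep probes whose share of failed samples is at most max_fail,
    # evaluated among the retained samples only (an assumption).
    probe_ids = list(next(iter(detp.values())))
    kept_probes = [
        probe for probe in probe_ids
        if failed_fraction((detp[s][probe] for s in kept_samples), p_threshold) <= max_fail
    ]
    return kept_samples, kept_probes
```

The remaining criteria (bead count, SNP proximity, cross-reactivity, sex chromosomes) depend on array annotation and are not covered by this sketch.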

Data transformation, batch correction, and cell type adjustment

For all downstream analyses, methylation intensity data were converted to M-values [53]. To detect batch effects, a principal component analysis was performed, based on the 10,000 sites with the highest variance. The first five principal components (PCs) were extracted. Extracted PCs were tested for association with possible batch effects using MANOVA and visual inspection of scatter plots. Detected batch effects were then removed using the ComBat procedure, as implemented in the sva package [54]. Following the removal of Sentrix ID and position-related effects, no further batch effects were evident. Cell type composition was estimated using the Houseman reference-based method, as implemented in the minfi package [55]. Adjustment for cell type composition was then performed by including the first five of a total of six cell type estimators as covariates in the regression models used for association testing.
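For readers unfamiliar with M-values, the standard relation to beta values is M = log2(beta / (1 - beta)). The Python sketch below illustrates this transform and its inverse. The function names and the clamping constant `eps` are illustrative assumptions, not part of the study pipeline.

```python
import math


def beta_to_m(beta, eps=1e-6):
    """Convert a beta value (0..1) to an M-value: log2(beta / (1 - beta))."""
    # Clamp to (eps, 1 - eps) so that beta = 0 or 1 does not cause
    # a division by zero or a log of zero (eps is an assumed constant).
    b = min(max(beta, eps), 1.0 - eps)
    return math.log2(b / (1.0 - b))


def m_to_beta(m):
    """Inverse transform: beta = 2^M / (2^M + 1)."""
    return 2.0 ** m / (2.0 ** m + 1.0)
```

A beta value of 0.5 (half of the copies methylated) maps to an M-value of 0, and the scale is symmetric around that point.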

Methylation association analysis

To avoid introducing additional heterogeneity with regard to birth weight, mothers whose pregnancy was complicated by premature delivery (gestational age <259 days) or treatment for gestational diabetes were excluded prior to analysis. To minimize the risk of spurious associations due to small sample size and thus avoid false positives, the analysis was strictly limited to high-confidence smoking-related CpG sites that had shown significant association with smoking in a large recent meta-analysis [31], and CpG sites previously reported to mediate the effect of maternal smoking on birth weight [37, 38]. Association testing for methylation levels was performed using general linear models as implemented in the R package limma. Gestational age, sex of the newborn, and parental height were included as covariates, since research has shown that these factors impact birth weight [56]. Adjustment was also made for maternal age. Cell type composition was taken into account, as described above. Sex and gestational age (weeks) were reported by the responsible obstetrician. Sex-specific analyses were then conducted using the covariates above, with the exception of sex.

To exclude other significant loci in this sample, we also performed an EWAS on smoking (data not shown).
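Correction for multiple testing across the tested CpG sites is expressed in this study as a false discovery rate (FDR). The text does not name the procedure, so the following Python sketch assumes the widely used Benjamini-Hochberg method. The function name is illustrative.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values
    # never increase as the rank decreases (monotonicity step).
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

A site would then be declared significant if its adjusted value falls below the chosen threshold, for example 0.05.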

Mediation analysis

Sites showing significant association with smoking (FDR<0.05) were then used to test whether methylation levels partly mediate the effects of maternal smoking on birth weight. This was performed using a quasi-Bayesian approach as proposed in Imai et al. ([57]; for a more detailed explanation please see Additional file 1: Text), as implemented in the R package mediation [57], running 10,000 simulations. The same possible confounders as named above were included as additional covariates in the mediator and outcome regression models (gestational age, sex of the newborn, parental height, maternal age, cell type composition; in contrast to Kupers et al. [37], socioeconomic status was not included in the model, as it was not associated with birth weight in our sample). For sex-specific analyses, sex of the newborn was excluded from the models, as above.
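The idea behind the simulation-based mediation test can be sketched in Python for the simplest case: two linear models without a treatment-mediator interaction, where the mediated effect is the product of the smoking-to-methylation coefficient and the methylation-to-birth-weight coefficient. This is a simplified illustration, not the code of the R package mediation. The function name, its arguments, and any numbers passed to it are hypothetical, and covariates are assumed to have been handled when the coefficients and standard errors were estimated.

```python
import random
import statistics


def simulate_mediated_effect(a_hat, se_a, b_hat, se_b, n_sims=10_000, seed=1):
    """Monte Carlo summary of the mediated effect a * b.

    a_hat, se_a: estimated effect of smoking on methylation and its SE.
    b_hat, se_b: estimated effect of methylation on birth weight
                 (adjusted for smoking) and its SE.
    """
    rng = random.Random(seed)
    # Draw both coefficients from their approximate normal sampling
    # distributions and form the product for each draw.
    draws = sorted(
        rng.gauss(a_hat, se_a) * rng.gauss(b_hat, se_b) for _ in range(n_sims)
    )
    # Percentile-based 95% interval from the sorted draws.
    lower = draws[int(0.025 * n_sims)]
    upper = draws[int(0.975 * n_sims) - 1]
    # Two-sided p-value from the share of draws on each side of zero.
    n_pos = sum(d > 0 for d in draws)
    n_neg = sum(d < 0 for d in draws)
    p_value = min(1.0, 2 * min(n_pos, n_neg) / n_sims)
    return {"effect": statistics.mean(draws), "ci95": (lower, upper), "p": p_value}
```

For example, `simulate_mediated_effect(-0.5, 0.1, 100.0, 20.0)` uses made-up inputs (smoking lowers the M-value by 0.5, and each M-value unit corresponds to 100 g of birth weight) and returns a mediated effect close to -50 g with an interval that excludes zero.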

Gene-based analysis of Early Growth Genetics (EGG) Consortium GWAS results

Data on the trait “birth weight” were obtained from the EGG Consortium via the UK Biobank Resource. These data were downloaded from www.egg-consortium.org [58]. Gene-based analysis was performed using the “birth weight summary statistic data 2016” (file: BW3_EUR_summary_stats.txt.gz from http://egg-consortium.org/birth-weight-2016.html) and MAGMA v1.06 [59]. The applied linkage disequilibrium (LD) structure was that of the 1000 Genomes project (http://www.internationalgenome.org/data). SNPs were assigned to a gene if the variant was located within 20 kb flanking its transcript.
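The SNP-to-gene assignment rule can be illustrated with a short Python sketch. It is not MAGMA's annotation code. The function name, the data layout, and the inclusive treatment of the window boundaries are assumptions, and any coordinates or SNP identifiers supplied to it in an example are hypothetical.

```python
def assign_snps_to_gene(snps, gene, window=20_000):
    """Return IDs of SNPs lying within `window` bp of a gene's transcript.

    snps maps SNP ID -> (chromosome, position).
    gene is a dict with keys "chrom", "start", and "end".
    """
    lo = gene["start"] - window  # upstream limit of the flanking window
    hi = gene["end"] + window    # downstream limit of the flanking window
    return [
        snp_id for snp_id, (chrom, pos) in snps.items()
        if chrom == gene["chrom"] and lo <= pos <= hi
    ]
```

With a 20 kb window, a gene spanning positions 100,000 to 150,000 would collect all SNPs on the same chromosome between positions 80,000 and 170,000.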

Results

After technical QC, a total of 405,654 sites and 311 individuals were in principle available for analysis. A total of n=5,527 of these CpG sites had previously been reported to be associated with smoking (see above) and were therefore selected for association testing. After excluding prematurely delivered newborns and individuals receiving anti-diabetic treatment during pregnancy, the final sample size was n=282 individuals (138 male and 144 female). Of these, 13 males and 12 females were the offspring of mothers who had smoked throughout pregnancy. The general characteristics of the participants included in the present study are provided in Additional file 1: Table S1.

Impact of maternal smoking on birth weight

a) The average difference in birth weight between the smoking and non-smoking groups was 209 g. The newborns of smokers had a mean birth weight of 3,267 g. The newborns of non-smokers had a mean weight of 3,476 g (see Table 1; see also Additional file 1: Figure S2 for a more fine-grained comparison between different groups).

b) In female newborns (n=144), the average birth weight was 189 g lower in the smoking subgroup compared to the non-smoking subgroup, i.e., birth weight was 5.6% lower when mothers smoked. This difference however did not reach statistical significance. In male newborns (n=138), the average birth weight was significantly lower by 242 g in the smoking subgroup compared to the non-smoking subgroup, i.e., birth weight was 6.7% lower when mothers smoked (see Table 1).

Table 1 Average birth weight of newborns

Methylation association analysis

Following correction for multiple testing, a total of 30 CpG sites showed significant differential methylation in the smoking subgroup (see Table 2). These CpGs map to 13 genes (AHRR, CNTNAP2, CYP1A1, FRMD4A, GFI1, ITGB7, MIR548F3, MYO1G, PIM1, RNF157, SAMD3, TFEB, UNC45B).

Table 2 Top differentially methylated CpGs (associated with maternal smoking after correction for multiple testing)

The EWAS on smoking revealed no further epigenome-wide associated CpG sites (data not shown).

DNA extraction method was not significantly associated with methylation levels, either when tested in a linear model or when included as a covariate (data not shown).

Mediation analysis

Of the 30 CpG sites found differentially methylated after maternal smoking in our sample, the following were found to mediate the effect of maternal smoking on birth weight: cg25325512 (PIM1, p=0.005); cg25949550 (CNTNAP2, p=0.008); and cg08699196 (ITGB7, p=0.045). Sex-specific analyses for these three CpG sites revealed that cg25949550 (CNTNAP2, p=0.022) mediated the effect of maternal smoking on birth weight in male newborns (see Table 3).

Table 3 Results of mediation analysis

Gene-based analysis of EGG Consortium GWAS results

In the gene-based analysis, PIM1, CNTNAP2, and ITGB7 were tested for association with birth weight. ITGB7 showed significant association (p=8.24×10^-7). PIM1 and CNTNAP2 failed to achieve nominal significance (p>0.05). Single marker p-values of the SNPs at the ITGB7 locus are listed in Additional file 1: Table S2.

Discussion

The aims of the present study were to replicate the finding that the association between maternal smoking and birth weight is mediated by methylation, and to investigate whether the observed mediation effects are sex-specific. Differentially methylated CpG sites were detected in 13 genes, including AHRR, GFI1, MYO1G, CYP1A1, and CNTNAP2. These represent the top five differentially methylated genes reported in adult smokers [16,17,18, 60, 61], and in previous studies of newborns exposed to maternal smoking [25,26,27, 31, 32, 37].

Mediation analysis of the 30 CpG sites revealed that CpG sites in the genes PIM1, CNTNAP2, and ITGB7 mediated the effect of maternal smoking on birth weight in the complete sample. The serine/threonine-protein kinase PIM1 has a marked anti-apoptotic effect, and its level is increased in lung tissue following exposure to cigarette smoke [62]. Interestingly, enhanced gene expression of PIM1, corresponding to a decreased methylation level, has been shown to protect against cell death induced by cigarette smoke and neutrophilic airway inflammation [62]. In our sample, the methylation level of cg25325512 in PIM1 is lower in the newborns of smokers than in the newborns of non-smokers. A lower methylation level can lead to an upregulation of gene expression. Thus, upregulation of PIM1 in response to smoke exposure could be an adaptive process to protect against the negative effects of cigarette smoking. In monkeys, research has shown that PIM1 is associated with body mass indices after calorie restriction [63]. PIM1 belongs to the PIM serine/threonine kinase family, which is involved in the regulation of cell survival. PIM kinases are constitutively active, and regulate cell growth, differentiation, and apoptosis [64].

The Contactin Associated Protein-Like 2 gene (CNTNAP2) encodes a member of the neurexin family, whose members function as cell adhesion molecules and receptors in the nervous system of vertebrates. Genetic variation in CNTNAP2 has been associated with the regulation of body weight [65]. Interestingly, research has also implicated CNTNAP2 in multiple neurodevelopmental disorders, including Gilles de la Tourette syndrome, schizophrenia, epilepsy, autism, attention deficit and hyperactivity disorder, and mental retardation [66]. CNTNAP2 may play a role in the formation of functionally distinct domains critical for the saltatory conduction of nerve impulses in myelinated nerve fibers. The methylation level of the mediating CpG cg25949550 in CNTNAP2 is lower in infants whose mothers smoked during pregnancy, presumably leading to higher gene expression. As loss of CNTNAP2 has been shown to lead to deficits in axonal excitability [67], lower methylation of this gene might also be an adaptive process to protect against the negative effects of cigarette smoking.

The Integrin Subunit Beta 7 gene (ITGB7) encodes a protein that is a member of the integrin superfamily. Members of this family are adhesion receptors, which are involved in signaling from the extracellular matrix to the cell. High expression of ITGB7 has been found to be associated with poor survival of cancer cells [68]. In the present study, the methylation level of cg08699196 in ITGB7 is increased, which may reduce its gene expression. A reduction in ITGB7 expression might result in higher survival of cancer cells and, thus, oncogenesis.

The genes PIM1, CNTNAP2, and ITGB7 were not implicated in previous EWAS of birth weight [69,70,71]. However, in a gene-based analysis of genetic data obtained from a genome-wide association analysis of birth weight by the EGG Consortium [58], the present authors found that ITGB7 was strongly associated with birth weight (p=4.2×10^-7). Furthermore, ITGB7 was found to be associated with the pathophysiology of childhood obesity in a Hispanic population [72].

As methylation levels are known to be sex-specific already at birth, the present study involved sex-specific mediation analyses. Unsurprisingly, these suggest that the mediation effects can also be dependent on sex. The CpG site in CNTNAP2 had a more pronounced effect in male newborns, and failed to reach significance in females. Further sex-specific mediation effects are possible. However, their detection will require larger samples.

In contrast to previous studies, parental height was included as a covariate in the present analyses, as it had a pronounced influence on birth weight in our study. Repeating the analysis without this covariate yielded 45 rather than 30 CpG sites associated with smoking. In the mediation analysis of these 45 CpG sites, three further sites became significant in addition to PIM1 and CNTNAP2 (see Additional file 1: Table S3). Determining whether the respective findings are true or false positives is problematic, and can only be resolved through the performance of larger studies and meta-analyses.

The present study had several limitations in terms of the investigated smoking phenotype. Smoking status was conceptualized as a dichotomized trait, and no distinction was made between light and heavy smokers. Furthermore, smoking was assessed using retrospective self-reports rather than prospective and objective measures such as cotinine levels. This may have led to an underestimation of the number of smoking mothers, and thus to an underestimation of the direct effect, and an overestimation of the mediation effect, of smoking. This phenomenon was reported recently by Valeri et al. in a study on the Norwegian MoBa cohort [73]. However, unreliable self-reporting of smoking status in the present cohort is unlikely for two reasons. First, the subjects had a relationship of trust with the recruitment centers, where they received regular obstetric care throughout pregnancy. Second, the direct effect of smoking on birth weight in the present cohort (average reduction in birth weight of ~200 g) was more pronounced than that reported in the MoBa study of Valeri et al. (~90 g). The rate of false self-reporting is therefore likely to be in accordance with rates reported in previous studies.

Conclusions

The present study supports reported findings that DNA methylation may represent a biological mechanism through which maternal smoking impacts birth weight. Unsurprisingly, this effect may be sex-dependent, as suggested for the first time in the present analyses. Further studies are warranted to investigate the role of the identified differentially methylated loci in mediating the association between maternal smoking during pregnancy and birth weight, and their role in determining offspring phenotypes in later life.