A population-based resource for intergenerational metabolomics analyses in pregnant women and their children: the Generation R Study.

INTRODUCTION
Adverse exposures in early life may predispose children to cardio-metabolic disease in later life. Metabolomics may serve as a valuable tool to disentangle the metabolic adaptations and mechanisms that potentially underlie these associations.


OBJECTIVES
To describe the acquisition, processing and structure of the metabolomics data available in a population-based prospective cohort from early pregnancy onwards and to examine the relationships between metabolite profiles of pregnant women and their children at birth and in childhood.


METHODS
In a subset of 994 mother-child pairs from a prospective population-based cohort study among pregnant women and their children from Rotterdam, the Netherlands, we used LC-MS/MS to determine concentrations of amino acids, non-esterified fatty acids, phospholipids and carnitines in blood serum collected in early pregnancy, at birth (cord blood), and at child's age 10 years.


RESULTS
Concentrations of diacyl-phosphatidylcholines, acyl-alkyl-phosphatidylcholines, alkyl-lysophosphatidylcholines and sphingomyelins were highest in early pregnancy; concentrations of amino acids and non-esterified fatty acids were highest at birth; and concentrations of alkyl-lysophosphatidylcholines, free carnitine and acyl-carnitines were highest at age 10 years. Correlations of individual metabolites between pregnant women and their children at birth and at the age of 10 years were low (range: r = −0.10 to r = 0.35).


CONCLUSION
Our results suggest that unique metabolic profiles are present among pregnant women, newborns and school-aged children, with limited intergenerational correlations between metabolite profiles. These data will form a valuable resource to address the early metabolic origins of cardio-metabolic disease.


Introduction
Cardio-metabolic diseases are of major public health concern (NCD Risk Factor Collaboration 2016). The pathogenesis of these cardio-metabolic diseases involves adaptations in metabolic pathways. Thus far, studies have mainly focused on a small set of conventional biomarkers to assess metabolic status and pathways. Recent developments in high-throughput technologies and analytical methods have enabled the application of metabolomics for detailed characterization of an individual's metabolic status on a large scale (Bictash et al. 2010;Tzoulaki et al. 2014;van Roekel et al. 2019). Metabolomics measures a large number of low molecular weight metabolites in biological tissues and fluids. The metabolome is the most downstream component of biological processes and is closely linked to the phenotype. It carries information about gene expression as well as lifestyle and environmental factors (Tzoulaki et al. 2014;van Roekel et al. 2019). Metabolomics has already been successfully applied in large-scale epidemiological studies, mainly in adult populations, to identify new biomarkers of cardio-metabolic disease status, development and progression, as well as the underlying pathophysiological mechanisms (Newgard 2017;Rangel-Huerta et al. 2019;Ussher et al. 2016).
Accumulating evidence suggests that cardio-metabolic diseases might originate in early life. Adverse exposures in early life may lead to developmental adaptations in organ structure or function, which may predispose these children to later cardio-metabolic disease (Gluckman et al. 2008). Early-life developmental adaptations in metabolic pathways may underlie these associations. Only a limited number of metabolomics studies on the early origins of cardio-metabolic disease have been performed. Most of these studies were small and mainly assessed cross-sectional relationships (Hivert et al. 2015;Rauschert et al. 2017a). Also, it is unclear whether metabolite profiles correlate between mothers and their children. The application of metabolomics in longitudinal birth cohort studies may serve as a valuable tool to identify biomarkers of metabolic status, in order to disentangle the mechanisms linking adverse exposures in early life to cardio-metabolic disease later in life (Hivert et al. 2015).
Therefore, in a population-based cohort from early pregnancy onwards among 994 mother-child pairs from Rotterdam, the Netherlands, we obtained serum concentrations of a range of metabolite groups involved in energy metabolism, including amino acids (AA), non-esterified fatty acids (NEFA), phospholipids (PL), and carnitines (Carn), in maternal blood in early pregnancy and in child's blood at birth (cord blood) and at age 10 years. We provide a detailed description of the data acquisition, processing and data structure and examine the relationships between metabolite profiles of pregnant women and their children at birth and in childhood.

Study population
The Generation R Study is a multi-ethnic population-based prospective cohort study from fetal life until adulthood in Rotterdam, the Netherlands, described in detail previously (Kooijman et al. 2016). The study was approved by the Medical Ethical Committee of the Erasmus Medical Center, University Medical Center, Rotterdam (MEC 198.782/2001/31). Written informed consent was obtained from all mothers at enrollment in the study. Measurement of conventional biomarkers of metabolic status in pregnancy and childhood has been described previously (Adank et al. 2019;Geurtsen et al. 2019;Silva et al. 2019). For metabolomics, 2,395 blood samples were analyzed from a subsample of 1,041 Dutch mother-child pairs who had their blood drawn at birth (cord blood) and at least 1 other time point: early pregnancy (mother) or the age of 10 years (child). A number of blood samples (n = 157) were excluded during data acquisition (e.g. low sample volumes, hemolytic samples) and processing (e.g. duplicate samples, high proportion of missing values, missing or non-Dutch ethnicity), leaving a total of 2,238 blood samples from 994 mother-child pairs available for analysis. Of these 994 mother-child pairs, a total of 814 mothers had early pregnancy data available, and 921 and 503 children had data available at birth and at the age of 10 years, respectively. Of all mothers included, 10 had a twin pregnancy. Metabolomics data were only available for one of the twins. Therefore, mothers with twin pregnancies were included only once in the dataset.

Sample collection and processing
Maternal early-pregnancy non-fasting blood samples were obtained at enrollment in the study [median gestational age: 12.8 weeks (95% range 9.9, 16.9)] by research nurses at one of the dedicated research centers (Kruithof et al. 2014). Umbilical venous cord blood samples were collected directly after birth [median gestational age at birth: 40.3 weeks (95% range 36.6, 42.4)] by a midwife or obstetrician. Child's non-fasting blood samples were obtained by research nurses at the 10-year follow-up visit to the research center [median age: 9.8 years (95% range 9.1, 10.6)]. All blood samples were transported to the regional laboratory (STAR-MDC), spun and stored at −80 °C for further studies within a maximum of 4 h after collection. For metabolite measurements, blood serum samples were transported on dry ice to the Division of Metabolic and Nutritional Medicine of the Dr. von Hauner Children's Hospital in Munich, Germany.

Metabolite measurements
A targeted metabolomics approach was adopted to determine serum concentrations (µmol/L) of AA, NEFA, PL and Carn (Hellmuth et al. 2017b). Detailed information is given in Supplemental Text S1 and Table S1. Briefly, AA were analyzed with an 1100 high-performance liquid chromatography (HPLC) system (Agilent, Waldbronn, Germany) coupled to an API2000 tandem mass spectrometer (AB Sciex, Darmstadt, Germany) (Harder et al. 2011). IUPAC-IUB Nomenclature was used for notation of AA (IUPAC-IUB Joint Commission on Biochemical Nomenclature 1984). NEFA, PL and Carn were measured with a 1200 SL HPLC system (Agilent, Waldbronn, Germany) coupled to a 4000QTRAP tandem mass spectrometer (AB Sciex, Darmstadt, Germany) (Hellmuth et al. 2012;Uhl et al. 2016). The analytical technique used is capable of determining the total number of double bonds, but not the position of the double bonds or the distribution of the carbon atoms between fatty acid side chains. We used the following notation for NEFA, PL and Carn.a: X:Y, where X denotes the length of the carbon chain, and Y the number of double bonds. The 'a' denotes an acyl chain bound to the backbone via an ester bond ('acyl-') and the 'e' represents an ether bond ('alkyl-'). For analyses, we categorized metabolites into general metabolite groups based on chemical structure (AA, NEFA, PC.aa, PC.ae, Lyso.PC.a, Lyso.PC.e, SM, Free Carn and Carn.a) and into detailed metabolite subgroups based on chemical structure and physiological and biological relevance (AA: branched-chain amino acids (BCAA), aromatic amino acids (AAA), essential AA, non-essential AA; NEFA, PC.aa, PC.ae, Lyso.PC.a, Lyso.PC.e and SM: saturated, mono-unsaturated, poly-unsaturated; Carn.a: short-chain, medium-chain, long-chain).
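The X:Y notation described above lends itself to a small parsing helper. The following is a minimal sketch in Python, not the authors' code. The label layout (a group prefix followed by "C", the carbon count, a colon and the double-bond count, as in "Carn.a C9:0") and the function name `parse_lipid_label` are assumptions made for illustration. Chain-length cut-offs for the Carn.a subgroups are not stated in the text, so only the saturation class is derived.

```python
import re

# Assumed label layout: "<group>[. or space]C<X>:<Y>", e.g. "PC.aa.C36:2" or "Carn.a C9:0".
_LABEL = re.compile(r"^(?P<group>.+?)[.\s]*C(?P<carbons>\d+):(?P<double_bonds>\d+)$")


def parse_lipid_label(label):
    """Split a metabolite label into group, chain length X and double bonds Y."""
    match = _LABEL.match(label.strip())
    if match is None:
        raise ValueError(f"label does not follow the X:Y notation: {label!r}")
    carbons = int(match.group("carbons"))
    double_bonds = int(match.group("double_bonds"))
    # Saturation class follows directly from Y (the number of double bonds).
    if double_bonds == 0:
        saturation = "saturated"
    elif double_bonds == 1:
        saturation = "mono-unsaturated"
    else:
        saturation = "poly-unsaturated"
    return {
        "group": match.group("group"),
        "carbons": carbons,
        "double_bonds": double_bonds,
        "saturation": saturation,
    }
```

For example, `parse_lipid_label("PC.aa.C36:2")` yields the group `PC.aa` with 36 carbon atoms and 2 double bonds, classified as poly-unsaturated.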

Quality control and pre-processing
To assess the precision of the measurements, six quality control (QC) samples per batch were consistently measured between study samples. After exclusion of outliers, the coefficients of variation (CV; SD/mean) for each batch (intra-batch) and for all batches (inter-batch) of the QC samples were calculated for each metabolite. In line with previous studies (Lindsay et al. 2015;Rauschert et al. 2017b;Shokry et al. 2019), for each metabolite we excluded batches with an intra-batch CV higher than 25%. A metabolite was excluded entirely if its inter-batch CV was higher than 35% or if less than 50% of its batches passed the QC (i.e. had an intra-batch CV not exceeding 25%). To correct for batch effects, the participant data at each time point were median corrected by dividing the metabolite concentration by the ratio of the intra-batch median to the inter-batch median of the QC samples (Shokry et al. 2019). In line with previous studies, metabolites and participants with more than 50% of missing values were excluded (Shokry et al. 2019). Missing values in other participants were imputed using the Random Forest algorithm (R package missForest), which has been shown to perform well with MS data (Wei et al. 2018).
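The QC thresholds and the median correction described above can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the pipeline used in the study: the function names are hypothetical, the SD is taken as the sample standard deviation, and the inter-batch median is taken as the median of the QC values pooled over all batches.

```python
from statistics import mean, median, stdev


def coefficient_of_variation(values):
    """CV = SD / mean (sample SD is an assumption; the text only says SD/mean)."""
    return stdev(values) / mean(values)


def batch_passes_qc(qc_values, max_cv=0.25):
    """A batch is kept for a metabolite if its intra-batch CV is at most 25%."""
    return coefficient_of_variation(qc_values) <= max_cv


def median_correct(sample_values, qc_batch, qc_all_batches):
    """Divide concentrations by (intra-batch QC median / inter-batch QC median)."""
    ratio = median(qc_batch) / median(qc_all_batches)
    return [value / ratio for value in sample_values]


# Hypothetical QC measurements of one metabolite in two batches (six QC samples each).
qc_a = [9.0, 10.0, 11.0, 10.0, 9.0, 11.0]
qc_b = [18.0, 20.0, 22.0, 20.0, 18.0, 22.0]
corrected_a = median_correct([10.0, 12.0], qc_a, qc_a + qc_b)
```

In this toy example the first batch reads systematically lower than the pooled QC median, so its participant values are scaled upwards.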

Statistical analysis
First, we calculated the sum of individual metabolite concentrations per general and detailed metabolite group and per time point. In order to explore the variability of the metabolites between participants and between time points, we obtained the median (95% range) for the individual metabolites and the summed metabolite concentrations per general and detailed metabolite group per time point. To enable comparison between time points, only metabolites that were present at each time point were included in the summed variables. Second, we explored the dimensionality of the data by conducting principal component analyses (PCA) at each time point separately. As log transformations did not sufficiently normalize the metabolite concentrations, we used square root transformations instead. These normalized metabolite concentrations were subsequently standardized by calculating standard deviation scores [SDS; (observed value − mean)/SD]. Third, as we considered PCA not informative for describing the information contained in our dataset, we further explored the correlation structure of the data by calculating pairwise Pearson's correlation coefficients between all individual metabolites within each time point and between individual metabolites at different time points. These correlations within and between time points were visualized using two circos plots (R package circlize) (Chung et al. 2018;Gu et al. 2014). To facilitate presentation, the first plot only includes correlation coefficients < −0.15 or > 0.15. To display correlation coefficients that are at least of weak magnitude, the second plot displays correlation coefficients < −0.30 or > 0.30 (Hinkle et al. 2003). To obtain further insight into possible metabolic pathways, we additionally presented correlations between metabolites within a time point as correlation networks, as correlations between metabolites were strongest within time points (Rosato et al. 2018).
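The transformation and correlation steps above can be illustrated with a short Python sketch. It is not the R code used in the study. The function names are hypothetical, and whether correlations were computed on raw or transformed concentrations is not stated in the text, so `pearson` is written to accept either.

```python
from math import sqrt
from statistics import mean, stdev


def sds_of_sqrt(values):
    """Square-root transform, then standardize: (observed value - mean) / SD."""
    roots = [sqrt(v) for v in values]
    m, s = mean(roots), stdev(roots)
    return [(r - m) / s for r in roots]


def pearson(x, y):
    """Pearson's correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def correlations_beyond(data, threshold=0.15):
    """Return metabolite pairs with r < -threshold or r > threshold.

    `data` maps a metabolite name to its values across the same participants.
    """
    names = sorted(data)
    kept = {}
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            r = pearson(data[first], data[second])
            if abs(r) > threshold:
                kept[(first, second)] = r
    return kept
```

Calling `correlations_beyond(data, 0.15)` and `correlations_beyond(data, 0.30)` would give the two sets of pairs corresponding to the two plot thresholds.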
To provide a numerical summary of the strength of the correlations, we additionally constructed heatmaps of the median absolute correlation coefficients within general and detailed metabolite groups and between general and detailed metabolite groups at each of the time points separately. We calculated the correlation coefficients for correlations between individual metabolites at different time points. Correlations of 0-0.29, 0.3-0.49, 0.5-0.69, 0.7-0.89, and 0.9-1.0 were considered to be very low, low, moderate, high and very high, respectively (Hinkle et al. 2003). As sex differences in metabolite concentrations may exist (Ellul et al. 2019), we repeated steps one and three stratified by child's sex. The statistical analyses were performed using R version 3.3.4 (R Foundation for Statistical Computing) (R Core Team 2015).
Table 1 provides general characteristics of the study population. Of the 994 mother-child pairs with data available, 125 (12.6%), 494 (49.7%) and 375 (37.7%) had data available at 1, 2, or 3 time points, respectively.
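The correlation strength categories and the median absolute correlation used in the statistical analysis can be sketched in a few lines of Python. The function names are hypothetical, and treating the published ranges as half-open bins (for example, 0.30 up to but not including 0.50 is "low") is an assumption about how values between the stated boundaries are handled.

```python
from statistics import median


def strength_label(r):
    """Label a correlation by its absolute size, following the ranges in the text."""
    size = abs(r)
    if size < 0.3:
        return "very low"
    if size < 0.5:
        return "low"
    if size < 0.7:
        return "moderate"
    if size < 0.9:
        return "high"
    return "very high"


def median_absolute_correlation(correlations):
    """Median of |r| over a collection of pairwise correlation coefficients."""
    return median([abs(r) for r in correlations])
```

Taking the absolute value before the median means that a group with equally strong positive and negative correlations is not summarized as uncorrelated.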

Variability
Data was available on a total of 196 metabolites, of which 195 metabolites in early pregnancy, 194 metabolites at birth and 181 metabolites at child's age 10 years. Descriptive information is provided in Supplemental Table S2. Table S2 gives the summed concentrations of the detailed metabolite subgroups, which followed similar patterns. Supplemental Fig. S1 shows that the summed metabolite concentrations did not differ by child's sex. Table 2 shows the number of components (PCs) required to explain percentages of cumulative variance at each time point. At each time point, a relatively high number of PCs was needed to explain > 85% of the variance. The obtained PCs did not clearly represent specific metabolic pathways (Supplemental Figs. S2-S4). Figure 2a shows all correlations lower than −0.15 or higher than 0.15, whereas Fig. 2b shows all correlations lower than −0.30 or higher than 0.30. At all time points, relatively high correlations were observed of individual metabolites within general metabolite groups and between individual metabolites from the different PL groups (PC.aa, PC.ae, Lyso.PC.a, Lyso.PC.e, and SM), between AA and Carn.a, and between NEFA and Carn.a. These correlations were mainly of positive direction, except some of the correlations between AA and Carn.a. In children of age 10 years only, some of the AA were negatively correlated with NEFA. Presentation of these correlations within pregnant women, children at birth and children at age 10 years as correlation networks showed the strongest correlations for individual metabolites within general metabolite groups (Supplemental Fig. S5).
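How the number of PCs needed for a given share of cumulative variance is read off can be sketched as below. This Python sketch assumes the per-component variances (eigenvalues) have already been obtained from a PCA. The function name and the example values are hypothetical and are not taken from Table 2.

```python
def components_needed(component_variances, threshold=0.85):
    """Smallest number of components whose cumulative variance share exceeds threshold."""
    total = sum(component_variances)
    cumulative = 0.0
    # Components are considered in order of decreasing explained variance.
    for count, variance in enumerate(sorted(component_variances, reverse=True), start=1):
        cumulative += variance
        if cumulative / total > threshold:
            return count
    return len(component_variances)


# Hypothetical variances: one dominant component versus evenly spread variance.
concentrated = components_needed([5.0, 3.0, 1.0, 1.0])
spread_out = components_needed([1.0] * 10)
```

When variance is spread evenly over components, as in the second example, nearly all components are needed, which is the pattern described as high dimensionality.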

Correlation structure
To provide further insight into the strength of these correlations, Fig. 3a-c summarizes the correlations as the median absolute correlations of individual metabolites within general and detailed metabolite groups (diagonal) per time point. The median absolute correlations between general and detailed metabolite groups per time point are shown off-diagonal. Median absolute correlations within general and detailed metabolite groups at the same time point were low to high, and ranged between r = 0.27 and r = 0.92. The strength of these within-group median correlations differed by detailed metabolite subgroup, with BCAA, mono-unsaturated NEFA, mono-unsaturated PC.aa, mono-unsaturated PC.ae, saturated Lyso.PC.e, mono-unsaturated SM and long-chain Carn.a generally having the highest median correlations within their respective general groups. Median absolute correlations between subgroups of different metabolite groups were very low, except for correlations between NEFA detailed subgroups and medium-chain Carn.a in early pregnancy (r ranging between 0.24 and 0.34) and at age 10 years (r ranging between 0.23 and 0.44), between BCAA and AAA and short-chain Carn.a in early pregnancy (r = 0.26 and r = 0.33, respectively) and at age 10 years (r = 0.30 and r = 0.25, respectively), and between BCAA and short-chain Carn.a (r = 0.33) at birth. Table 3 shows correlations of individual metabolites between each of the time points. For presentation purposes, this table only gives the 30 strongest correlations at each combination of time points; all correlations are given in Supplemental Table S3. Correlations between early pregnancy and child's metabolites at birth mainly included Free Carn and Carn.a, some long-chain and very-long-chain NEFA, and some mainly non-essential AA. Correlations between early pregnancy and child age 10 years included a few AA and some PC.aa. In children, metabolites correlated between birth and age 10 years mainly included phospholipids.
Almost all correlations were very weak, except the correlations between early pregnancy and birth Free Carn (r = 0.35) and Carn.a C9:0 (r = 0.32). Supplemental Figs. S6 and S7 show that the correlations between individual metabolites and median absolute correlations, respectively, were similar for boys and girls.
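Correlating a metabolite between two time points requires pairing measurements from the same mother-child pair, and a ranked table requires sorting by absolute size. A minimal Python sketch follows. Restricting each correlation to pairs with data at both time points is an assumption, and all names and example values are hypothetical.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson's correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


def paired_correlation(values_t1, values_t2):
    """Correlate one metabolite across two time points.

    Both arguments map a pair identifier to a concentration; only identifiers
    present at both time points are used. Returns (r, number of pairs).
    """
    shared = sorted(set(values_t1) & set(values_t2))
    x = [values_t1[pid] for pid in shared]
    y = [values_t2[pid] for pid in shared]
    return pearson(x, y), len(shared)


def strongest(correlations, n=30):
    """Return the n (name, r) items with the largest absolute correlation."""
    return sorted(correlations.items(), key=lambda item: abs(item[1]), reverse=True)[:n]
```

Sorting on the absolute value keeps strong negative correlations in the ranking alongside positive ones.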

Discussion
We described the data acquisition, processing and structure of the metabolomics data available in the Generation R Study and assessed the relationships between metabolite profiles of pregnant women and their children at birth and in childhood. Metabolite concentrations vary considerably between pregnant women and their children at birth and at the age of 10 years. The individual metabolites correlate within groups of metabolites with similar chemical structures, but to a lesser extent between groups of metabolites with different chemical structures. The correlations of individual metabolites between pregnant women and their children at birth and age 10 years are relatively low.

Interpretation of main findings
Metabolomics studies targeting cardio-metabolic diseases have already been successfully applied in adults (Newgard 2017;Rangel-Huerta et al. 2019;Ussher et al. 2016), whereas studies on the early origins of cardio-metabolic disease remain limited (Hivert et al. 2015;Rauschert et al. 2017a). We obtained intergenerational metabolomics data at three different time points during pregnancy and postnatal life, which may provide more detailed insight into the early origins of cardio-metabolic disease and the underlying mechanisms, and may help identify potential novel biomarkers. Maternal metabolic profile during pregnancy might influence fetal metabolic profile, either directly through placental transfer, or indirectly by influences on hormone levels or placental function (Hivert et al. 2015). Maternal blood metabolite concentrations generally tend to decrease across pregnancy, likely reflecting increased circulating volume, tissue biosynthesis and placental uptake (Lindsay et al. 2015). Fetal metabolite concentrations are the result of both placental transfer and endogenous synthesis. Concentrations of AA, Carn and NEFA, particularly long-chain polyunsaturated fatty acids (LC-PUFA), tend to be higher in fetal blood than in maternal blood (Larque et al. 2011;Regnault et al. 2002;Schmidt-Sommerfeld et al. 1985). This might be indicative of an active transport mechanism across the placenta or increased fetal synthesis. Although the large time differences between the metabolite measurements in our study should be noted and preclude direct conclusions about placental transfer, our observation that the summed concentrations of AA, NEFA and Carn.a were higher in cord blood than in maternal early pregnancy blood is in line with these previous studies. The lower PL concentrations observed in cord blood in comparison to maternal early pregnancy blood might be explained by the fact that PL do not cross the placenta, but are hydrolyzed to NEFA that in turn cross the placental barrier (Herrera and Ortega-Senovilla 2010;Larque et al. 2011;Rice et al. 1998).
Relatively high correlations between individual metabolites within known general and detailed metabolite subgroups in pregnant women as well as in cord blood were observed, as expected from the shared precursors and biosynthesis pathways. However, correlations of individual metabolites between these two time points were relatively weak. These results are in line with those from a multi-ethnic study among 1600 participants that showed mostly weak correlations of these metabolites between maternal blood at 28 weeks of gestation and cord blood (Lowe et al. 2017). In our study, there is a large time difference between the metabolite measurements in mothers and newborns. Therefore, the relatively low correlations between maternal and cord blood metabolites might result from changes in metabolism in both pregnant women and the fetus that occur throughout pregnancy (Herrera and Ortega-Senovilla 2010;Lindsay et al. 2015). In addition, placental transfer of nutrients throughout pregnancy is tightly regulated by various transport mechanisms to ensure stable fetal metabolite concentrations at the expense of variations in maternal metabolite concentrations (Larque et al. 2013;Rossary et al. 2014). Less is known about the metabolite profiles from birth throughout childhood and the influence of maternal metabolite profiles in pregnancy on these profiles. A study among 127 children from Sweden showed that concentrations of conventional lipids, including total cholesterol, LDL cholesterol and HDL cholesterol, increased between the age of 6 months and 4 years, whereas triglyceride concentrations decreased (Ohlund et al. 2011). A study among 500 children and adolescents aged 0 to 19 years observed that concentrations of AA, NEFA, and Carn.a dropped after the neonatal period. However, some of these Carn.a increased again from the age of 7 years and returned to neonatal concentrations at age 19 years (Teodoro-Morrison et al. 2015).
A large familial resemblance in metabolite concentrations has been suggested, which seems to be largely genetic (Draisma et al. 2013;Kettunen et al. 2016;Rueedi et al. 2014). In cross-sectional studies, correlations of metabolites between parents and their offspring vary strongly, ranging from weak to relatively strong (Ellul et al. 2019;Halvorsen et al. 2015;Ohlund et al. 2011). Partly in line with these previous studies, we observed that AA and NEFA concentrations were lower in childhood as compared to cord blood samples, whereas concentrations of PL and Carn were higher in childhood. However, the correlations between individual metabolite concentrations of children at birth and at the age of 10 years as well as between mothers in early pregnancy and their children at the age of 10 years were very weak. This might be explained by the large timespan between the measurements. Also, previous research has indicated that metabolite concentrations are highly influenced by nutritional factors, physical activity and the gut microbiome (Hellmuth et al. 2019;Lau et al. 2018;Palmnas et al. 2018;Pedersen et al. 2016;Wang et al. 2011). Differences in these factors between mothers and their children and over time might explain the weak correlations between different time points. Previous studies observed sex differences in metabolite concentrations in both children and adults (Ellul et al. 2019;Teodoro-Morrison et al. 2015). We did not observe metabolite concentrations to vary between the sexes. This could be explained by the relatively young age of the participants, as sex differences in metabolite concentrations have been shown to be more pronounced in adolescence and adulthood (Ellul et al. 2019;Teodoro-Morrison et al. 2015). Thus, correlations of individual metabolites between pregnant women and their children at school age and within children over time are very low.
This might suggest strong influences of external factors and limited intergenerational correlations of metabolite profiles. We provided the first explorative analyses of a unique large longitudinal dataset consisting of metabolomics data of pregnant women and their children at birth and in childhood, and studied correlations between a large number of metabolites at these different time points. Not much is known yet about the correlations of metabolites between pregnant women and their children and the metabolite profiles in children from birth until childhood. We observed relatively low correlations of metabolite concentrations between time points. We explored whether offspring sex affected these correlations as this is an important baseline characteristic which has been suggested to influence metabolite profiles in children and adults, but this did not affect our findings. Other maternal and childhood factors are likely to influence metabolite profiles in pregnant women, and the development of metabolite profiles from birth until childhood. Further studies are needed to obtain detailed insight into the influence of maternal and offspring socio-demographic, lifestyle and physical factors on the stability of metabolite profiles in pregnancy and from birth throughout childhood. Future studies using these data should take into account the correlations of metabolites within the same metabolite group. PCA, a data reduction approach commonly used in metabolomics, showed that the data were highly dimensional. This indicates that the variability in the data is difficult to capture in a lower number of components and that each metabolite contributes unique information. In addition, the obtained components did not describe specific metabolic pathways. Therefore, we do not consider the PCs informative in describing the information contained in this dataset.
Given the high dimensionality of the data and the relatively high correlation of metabolites within metabolite groups, it seems that future studies focused on relating these data to exposures and outcomes of interest should analyze the data per individual metabolite and per metabolite group with structural, physiological and biological relevance. In addition, correlation networks based on correlations between individual metabolites or more advanced pathway analysis may be useful for identifying metabolic pathways involved in these associations. Due to the longitudinal nature of the data and the large amount of data on relevant exposures and outcomes available in the cohort, these data will form an important population-based resource for future metabolomics analyses on the developmental origins of cardio-metabolic disease.

Methodological considerations
We obtained metabolomics data in a subgroup of the cohort, which consists of Dutch, relatively highly educated and healthy participants, as compared to the full cohort (Kooijman et al. 2016). This may affect the generalizability of our sample to the full cohort and the general population. We adopted a targeted metabolomics approach, which enabled us to study absolute metabolite concentrations of metabolites known a priori to be relevant for obesity and cardio-metabolic disease. However, the targeted design might also be a limitation in future association studies, as relevant biological pathways might be missed. The blood samples used in our study were non-fasting and taken at non-fixed times of the day for logistical and ethical reasons (relatively young age of the children). Metabolite concentrations are dependent on fasting status. Fasting blood samples are usually preferred, as they are more reliable over time (Carayol et al. 2015). The use of non-fasting blood samples in our study might influence precision and power to detect associations of interest. However, non-fasting blood samples appear to be more informative of metabolic status throughout the day. Also, non-fasting lipids have been shown to perform equally or even better than fasting lipids in predicting the risk of cardiovascular disease (Nordestgaard et al. 2016). We therefore still consider non-fasting metabolite concentrations to be of interest. Due to the longitudinal design of the study, we were able to measure metabolite concentrations at 3 different time points during pregnancy and early postnatal life. However, due to the large time intervals between the blood samples and differences in the nature of the blood samples, small differences in procedures and handling of the blood samples may exist.
As previous studies showed that different pre-storage temperatures and durations only minimally affected measured concentrations of most metabolites, we consider it unlikely that this strongly influenced our results.

Conclusions
Metabolite concentrations vary between pregnant women and their children at birth and at the age of 10 years. Correlations of individual metabolites between pregnant women and their children at birth and in childhood are relatively low. This may suggest that unique metabolic profiles are present among pregnant women, newborns and school-aged children, with limited intergenerational correlations between metabolite profiles. These data are an important population-based resource for future metabolomics analyses to address the early origins of cardio-metabolic disease.