Glycerophospholipid and detoxification pathways associated with small for gestation age pathophysiology: discovery metabolomics analysis in the SCOPE cohort

Small for gestational age (SGA) may be associated with neonatal morbidity and mortality. Our understanding of the molecular pathways implicated is poor. Our aim was to determine the metabolic pathways involved in the pathophysiology of SGA and examine their variation between maternal biofluid samples. Plasma (Cork) and urine (Cork, Auckland) samples were collected at 20 weeks’ gestation from nulliparous low-risk pregnant women participating in the SCOPE study. Women who delivered an SGA infant (birthweight < 10th percentile) were matched to controls (uncomplicated pregnancies). Metabolomics (urine) and lipidomics (plasma) analyses were performed using ultra performance liquid chromatography-mass spectrometry. Features were ranked based on FDR adjusted p-values from empirical Bayes analysis, and significant features putatively identified. Lipidomics plasma analysis revealed that 22 out of the 33 significantly altered lipids annotated were glycerophospholipids; all were detected in higher levels in SGA. Metabolomic analysis identified reduced expression of metabolites associated with detoxification (D-Glucuronic acid, Estriol-16-glucuronide), nutrient absorption and transport (Sulfolithocholic acid) pathways. This study suggests higher levels of glycerophospholipids, and lower levels of specific urine metabolites are implicated in the pathophysiology of SGA. Further research is needed to confirm these findings in independent samples.


Introduction
Small for gestational age (SGA) is defined as an infant born with a birthweight less than the 10th percentile when compared to population (weight according to gestational age at birth) or to customised curves (charts individualized on the basis of maternal characteristics) (Sharma et al. 2016b). SGA is associated with placental dysfunction (Dessì et al. 2015) and is a major cause of maternal and neonatal morbidity (Diderholm 2009). Infants born SGA are at higher risk of perinatal mortality and neonatal complications, such as asphyxia (Rosenberg 2008) and neurological impairment (Grantham-McGregor 1998; Sharma et al. 2016a), as well as longer term morbidity, including increased risk of cardiovascular disorders and type 2 diabetes (Dessì et al. 2012). The lower the birth centile, the higher the risk of short- and long-term morbidity. SGA can be further classified as moderate (birthweight centile in the 3rd to 10th percentile), and severe (birthweight less than the 3rd percentile) (Lee et al. 2003). A better understanding of the molecular pathways involved in the pathophysiology of SGA could enable better prevention, earlier detection and potential treatment of this pregnancy complication.
Metabolomics is the study of all low molecular weight molecules (50-2000 Da), or metabolites, present in a sample. Metabolic profiling in a clinical research setting has led to the determination of risk factors and pathophysiology of specific diseases. Three metabolomics studies using plasma or serum samples showed that the pathophysiology of SGA appears to affect lipid and fatty acid metabolism (Horgan et al. 2011; Sulek et al. 2014; Delplancke et al. 2018). In the Greek Rhea cohort study, urine samples taken at 11 weeks of gestation showed an association between elevated levels of acetate, tyrosine, formate, trimethylamine, lysine and glycoprotein and higher risks of subsequent SGA and preterm birth (Maitre et al. 2014).
To maximise the opportunity to observe metabolic changes that may explain pathophysiological changes seen in SGA, we have analysed maternal plasma and urine. We hypothesised that these changes would precede detection of disease. The aim of our study was to gain further insight into the metabolic and lipidomic pathways involved in the pathophysiology of SGA, using urine and plasma metabolic profiles of women at 20 weeks of gestation, in geographically distinct populations of the SCOPE pregnancy cohort.

Participants
The present nested case-control study was performed on samples selected from the SCreening fOr Pregnancy Endpoints study (SCOPE, www.scopestudy.net) in Cork, Ireland, and Auckland, New Zealand. This study was performed in accordance with the 1964 Helsinki declaration and its later amendments. Informed consent and ethical approval were obtained (Ireland ECM5 (10) 05/02/08, and New Zealand AKX/02/00/364). SCOPE is an international pregnancy cohort that recruited 1773 women in Ireland and 2034 in New Zealand with low-risk and singleton pregnancies (Kenny et al. 2014). Selected participants who delivered small for gestational age babies, with customised birthweights less than the 10th centile (cases), were matched to participants who had healthy and uncomplicated pregnancies (controls). Controls were matched to cases by age (± 5 years), body mass index (BMI ± 3.5 kg/m²), and ethnicity (Table 1). For definitions of all clinical endpoints used in Table 1, see Australian New Zealand Clinical Trials Registry (ANZCTR), using study number ACTRN12607000551493. Urine and non-fasted plasma samples were obtained from 40 selected SGA cases in the Cork population and matched with 40 controls. In the Auckland population, only urine samples were analysed, with a similar case-control design: 40 women were selected for the SGA group and matched with 40 controls. The nested case-control study designs are summarised in Fig. 1.
Our study included 40 cases and 40 controls, with an estimated 95% power to detect small to medium effects. The calculation was performed using MEDCALC® (www.medcalc.com). Assuming a Type I error rate, α, of 5% and at least a 50% change in mean (with a similar within-sample standard deviation), a sample size of 40 cases and 40 controls gave a study power of 0.9474.
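As an illustration of how a power figure of this kind can be approximated, the sketch below uses a normal approximation for a two-sided comparison of two group means. It is not a reproduction of the MEDCALC® calculation, whose exact inputs are not given in the text; in particular, the standardised effect size `effect_size_d = 0.8` is a hypothetical value chosen for the example, and the function name is ours.

```python
from statistics import NormalDist


def two_sample_power(effect_size_d: float, n_per_group: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided, two-sample comparison of means.

    Uses the normal approximation; effect_size_d is the difference in means
    divided by the common within-group standard deviation.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)                 # two-sided critical value
    ncp = effect_size_d * (n_per_group / 2) ** 0.5    # non-centrality for equal group sizes
    # Probability of rejecting in either tail under the alternative
    return z.cdf(ncp - z_crit) + z.cdf(-ncp - z_crit)


# Hypothetical input: d = 0.8 is an assumption, not a value reported in the study
power = two_sample_power(effect_size_d=0.8, n_per_group=40)
print(f"approximate power: {power:.3f}")
```

With these assumed inputs the approximation falls in the mid-0.94 range, but that should not be read as confirming the inputs used by the authors.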

Sample preparations
For both Cork and Auckland populations, urine samples were prepared and analysed independently. Cork urine samples were prepared and analysed independently of plasma analysis.

Plasma samples
Plasma samples taken at 20 weeks of gestation were prepared in a randomised order. The lipid extraction method used was based on a protocol described by Matyash et al. (2008). Plasma samples were taken out of −80 °C storage and left on ice until thawed; none had been previously thawed. Plasma samples (200 µl) were transferred to glass tubes, pre-chilled methanol (−20 °C, 600 µl) was added, and the mixture was vortex mixed for 1 min. Methyl tert-butyl ether (MTBE, 5 ml) was added to the samples, which were then incubated at room temperature on a shaker for 1 h. Water (1 ml) was added and samples were left on a shaker for 10 min at room temperature; samples were then centrifuged at room temperature for 10 min at 1000 g. The organic (top) layer was transferred into glass tubes and left to evaporate overnight at room temperature. Each sample was reconstituted in 200 µl of 65:30:5 isopropanol:acetonitrile:water (IPA:ACN:water) and vortex mixed for 30 s. Lastly, all samples were transferred to LC glass vials in preparation for LC-MS analysis. Quality control (QC) samples were created by pooling 10 µl from each extracted sample. Multiple aliquots of pooled QC (100 µl per aliquot) were prepared in several LC vials.

Urine samples
Urine samples taken at 20 weeks of gestation were prepared in a randomised order, following the Nature Protocol developed by Want et al. (2010). Samples stored at −80 °C were defrosted on ice, then centrifuged for 10 min at 10,000 g. Supernatant (100 µl) was transferred to a new tube and Milli-Q water with 0.1% formic acid (FA, 200 µl) was added (ratio 1:2, urine:water 0.1% FA). Samples were then transferred to LC-MS glass vials. Quality control samples (QCs) were prepared by pooling 20 µl from each sample; the pooled QC was then prepared in the same way as the other samples, with 100 µl subsequently transferred to several LC vials.

Plasma samples
LC-MS analysis was performed using an ACQUITY ultra performance liquid chromatography (Waters Corp, Milford, MA) coupled with a Synapt G2-S quadrupole time-of-flight (UPLC-Q-TOF) mass spectrometer (Waters Corp, Wilmslow, UK). Samples were analysed on a BEH C18 column (1.7 µm, 2.1 × 100 mm) which was maintained at 65 °C, whilst samples were maintained at 7 °C. Analytes were separated over a 23 min gradient using a flow rate of 0.4 mL/min. Mobile phase A was a mix of 10 mM of ammonium formate in acetonitrile:water (ACN:H2O, 60:40 (v:v)), and B was a mix of 10 mM of ammonium formate in isopropanol:acetonitrile (IPA:ACN, 90:10 (v:v)). The gradient consisted of initial conditions at 30% of B, before increasing to 99% of B at 18 min and maintaining this for a further 2 min before decreasing to 30% of B over 2 min, returning to initial conditions. Column conditioning consisted of 8 repeat injections of the pooled QC. Samples were analysed as technical triplicates in a randomised order with the pooled QC injected every tenth injection throughout the analysis.
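The plasma gradient described above can be written as a small table of time points and looked up programmatically. The sketch below is only an illustration: the text gives the breakpoints but not the ramp shape, so linear ramps between breakpoints are an assumption, as is holding 30% B from 22 min until the end of the 23 min run. The names `PLASMA_GRADIENT` and `percent_b` are ours.

```python
# (time in min, % mobile phase B); breakpoints taken from the text,
# the final hold to 23 min is an assumption
PLASMA_GRADIENT = [
    (0.0, 30.0),
    (18.0, 99.0),
    (20.0, 99.0),
    (22.0, 30.0),
    (23.0, 30.0),
]


def percent_b(t_min: float, program=PLASMA_GRADIENT) -> float:
    """Return %B at time t_min, assuming linear ramps between breakpoints."""
    if t_min <= program[0][0]:
        return program[0][1]
    for (t0, b0), (t1, b1) in zip(program, program[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    # After the last breakpoint the column is back at initial conditions
    return program[-1][1]


for t in (0, 9, 18, 21, 23):
    print(f"{t:>4} min: {percent_b(t):.1f}% B")
```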

Urine samples
LC-MS analysis was performed using an ACQUITY ultra performance liquid chromatography (Waters Corp, Milford, MA) coupled with a Synapt G2-S quadrupole time-of-flight (UPLC-Q-TOF) mass spectrometer (Waters Corp, Wilmslow, UK). Samples were analysed on a BEH C18 column (1.7 µm, 2.1 × 100 mm) which was maintained at 40 °C, whilst samples were maintained at 7 °C. Analytes were separated over a 15 min gradient using a flow rate of 0.5 mL/min. Mobile phase A was water and 0.1% formic acid, and B was methanol and 0.1% formic acid. The gradient consisted of initial conditions at 1% of B for 1 min, before increasing to 15% of B over 3 min, increasing to 50% of B over 3 min, and further increasing to 95% of B over 3 min and maintaining this for a further 1 min, before decreasing to 1% of B over 5 min, returning to initial conditions. Column conditioning consisted of 8 repeat injections of the pooled QC. Samples were analysed as technical triplicates in a randomised order with the pooled QC injected every tenth injection throughout the analysis.
Cork plasma samples were analysed in December 2017 and Cork urine samples in May 2017. Auckland urine samples were analysed in July 2017. All samples were analysed on the same instrument, using a new BEH C18 column for each experiment, following the protocols described above.

MS configuration
For both urine and plasma samples, data were acquired using the data independent acquisition (DIA) mode, MSᴱ (Bateman et al. 2002; Silva et al. 2005), using a Synapt G2-S Q-TOF mass spectrometer (Waters Corp, Wilmslow, UK). Data were acquired in resolution mode, from 50 to 1500 Da, first in positive followed by negative electrospray ionisation modes (ESI+, and ESI−). Both precursor (low energy) and fragment (high energy) ion data were collected during the same acquisition, with 0.1 s scan time for each, and a total cycle time of 0.2 s. A linear collision energy ramp (20-40 eV) was applied for high energy, over the 0.1 s scan. Capillary voltage was set at 3.0 kV, sampling cone at 40 V, extraction cone at 5 V. The source temperature was set at 120 °C, and the desolvation temperature was 650 °C. The desolvation gas flow rate was set at 800 L/h, and cone gas at 50 L/h. Mass calibration was performed before each batch analysis using a sodium formate mix (Waters, Wexford, Ireland), as recommended by the manufacturer. Real time lock mass correction was performed using a leucine enkephalin (LeuEnk, 1 ng/µL) mix, injected at 10 µL/min through a lock-spray probe and acquired every 30 s.

Data processing
Progenesis QI version 2.4 (Nonlinear Dynamics, Newcastle, UK) was used to align and peak pick the LC-MS data using its automatic processing facility. From all the pooled QC samples analysed in the same analysis, the automatic algorithm of the Progenesis QI software selected the most suitable run, and this pooled QC was used as the alignment reference to chromatographically align the data. Data were then peak picked, and considered the adducts correspond-

Statistical analysis
Statistical analysis of the demographic and clinical data was performed using Student's t-test, Mann-Whitney U test, or Pearson χ² test, with multiple testing corrections as appropriate (IBM SPSS Statistics 24). Results were considered statistically significant at p < 0.05 (Table 1). Before sample preparation and analysis, a block randomisation based on the patients' BMI and the outcome was performed. No significant dependency between measurement order, the outcome, and biometric and clinical information about patients was observed using the Mann-Whitney U test, Spearman correlation, Chi-square test and Kruskal-Wallis test as applicable; the Benjamini and Hochberg procedure was applied for multiple testing correction (Benjamini and Hochberg 1995).
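For readers unfamiliar with the Benjamini and Hochberg procedure mentioned above, a minimal sketch of the standard step-up adjustment is given below. It is written in plain Python for illustration only (the study itself used SPSS and R), and the p-values in the example are made up.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up p-values, for illustration only
example_p = [0.001, 0.008, 0.039, 0.041, 0.60]
adjusted = benjamini_hochberg(example_p)
for p, q in zip(example_p, adjusted):
    print(f"p = {p:<6} adjusted = {q:.5f}")
```

A feature is then called significant at a chosen false discovery rate if its adjusted p-value falls below that threshold.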
Statistical analyses of the UPLC-MS data were performed using the R statistical software (R Core Team 2013) and the Bioconductor package limma (Ritchie et al. 2015); the package ggplot2 (Wickham 2011) was used to create volcano plots. Data obtained from analyses of urine and plasma samples were subjected to the same statistical analyses, performed independently. Data were acquired in positive and negative electrospray ionisation modes (ESI+, ESI−), which are known to show significant differences, and were therefore analysed independently. Median normalisation of the raw UPLC-MS data and application of quality control procedures (Broadhurst et al. 2018) were performed. Measurement precision of each feature was checked by computing the coefficient of variation (CV) and the missing rate over the replicate measurements, using pooled QC samples as reference. Features with a missing rate greater than or equal to 20% and features with a CV greater than or equal to 30% were filtered out. Robust multi-variable methods (Wang et al. 2018; Brereton and Lloyd 2014) were used to rank and select features. Empirical Bayes was used on each feature, and the design was adjusted for replicate measures (Smyth 2004); features analysed were ranked by adjusted p-value. In addition, the Mann-Whitney U test was performed on the average measurement per patient with multiple testing correction (Benjamini and Hochberg 1995). Agreement between the two methods (similar trends and results) was shown and confirmed the robustness of the results (data not shown). The features of interest were tested for correlation with clinical variables, using the Wilcoxon-Mann-Whitney test, Spearman correlation test, or Kruskal-Wallis test as appropriate. Multiple testing correction (Benjamini and Hochberg 1995), with a false discovery rate (FDR) cut-off of 25%, was applied.
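The pooled-QC filtering and median normalisation steps described above can be sketched as follows. This is an illustrative outline rather than the authors' R code: representing missing values as `None`, dividing each sample by its own median, and the function names are all assumptions made for the example.

```python
from statistics import mean, median, stdev


def qc_feature_passes(qc_intensities, max_missing=0.20, max_cv=0.30):
    """Keep a feature only if, across pooled-QC injections, its missing
    rate is below 20% and its coefficient of variation is below 30%."""
    n = len(qc_intensities)
    observed = [x for x in qc_intensities if x is not None]
    missing_rate = (n - len(observed)) / n
    if missing_rate >= max_missing or len(observed) < 2:
        return False
    avg = mean(observed)
    if avg <= 0:
        return False
    cv = stdev(observed) / avg
    return cv < max_cv


def median_normalise(sample_intensities):
    """Divide every intensity in one sample by that sample's median
    (one common form of median normalisation; assumed here)."""
    observed = [x for x in sample_intensities if x is not None]
    med = median(observed)
    return [None if x is None else x / med for x in sample_intensities]


# Made-up pooled-QC intensities for three hypothetical features
stable = [100, 102, 98, 101, 99, 100, 103, 97, 100, 100]
sparse = [100, None, 98, 101, None, 100, 103, 97, 100, 100]
noisy = [10, 100, 50, 200, 20]
print(qc_feature_passes(stable), qc_feature_passes(sparse), qc_feature_passes(noisy))
```

Here the second feature is rejected because two of ten QC values are missing (a 20% missing rate meets the exclusion threshold), and the third because its CV is well above 30%.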

Metabolites putative annotation
In accordance with the MSI reporting standards, we have achieved metabolite identification level 2, or putatively annotated compounds (Sumner et al. 2007). For each dataset, the exact masses of significant features (adjusted p-value < 0.05) were searched against the Human Metabolome Database (HMDB, version 4.0) (Wishart et al. 2018) and Lipidmaps (version of January 2019) (Cotter et al. 2007) using the Progenesis QI identification tool. The search was performed using the theoretical fragmentation MetaScope mode, which compared our experimental fragments to theoretical fragmentation patterns generated by the simulated breaking of bonds in the structures of possible identifications. The search parameters were set for an exact mass tolerance of 5 ppm for the precursor ion, and of 10 ppm for the fragment ion. With UPLC-MS, it is common that metabolites are detected multiple times due to fragmentation, dimerization, chemical adduction, or multiple charging. Putative annotations were reported for unique metabolites, after removing duplicates and metabolites of drugs or metabolites originating from food, and after checking the retention time. For each database identification match, the retention time of the feature was checked to ensure it was in the appropriate time window for the chromatographic separation method used. For instance, if a feature with a retention time of 11 min matched a lysophosphatidylcholine (LPC), a class expected to elute early (around 1-2 min) due to its polarity, then it was assumed the feature was not an LPC. Using HMDB, the metabolites reported were grouped into chemical classes.
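To make the mass tolerance criteria concrete, the sketch below shows how a ppm mass error is computed and compared against the 5 ppm (precursor) and 10 ppm (fragment) limits. It does not reproduce the Progenesis QI search itself; the m/z values are hypothetical and the function names are ours.

```python
PRECURSOR_TOL_PPM = 5.0   # tolerance stated for the precursor ion
FRAGMENT_TOL_PPM = 10.0   # tolerance stated for fragment ions


def ppm_error(measured_mz: float, theoretical_mz: float) -> float:
    """Signed mass error in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(measured_mz: float, theoretical_mz: float, tol_ppm: float) -> bool:
    """True if the absolute ppm error does not exceed the tolerance."""
    return abs(ppm_error(measured_mz, theoretical_mz)) <= tol_ppm


# Hypothetical m/z values, for illustration only
theoretical = 500.0000
close_match = 500.0020   # about 4 ppm away
loose_match = 500.0040   # about 8 ppm away

print(within_tolerance(close_match, theoretical, PRECURSOR_TOL_PPM))
print(within_tolerance(loose_match, theoretical, PRECURSOR_TOL_PPM))
print(within_tolerance(loose_match, theoretical, FRAGMENT_TOL_PPM))
```

At m/z 500, a 5 ppm window corresponds to roughly ±0.0025 Da, which is why the second value fails the precursor criterion but would still pass the wider fragment criterion.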

Global lipidomic analysis of plasma samples
Clinical and demographic data from selected Cork SGA cases (customised birthweight < 10th centile; median centile 2.35, IQR 0.8-4.48, n = 40) and matched controls are presented in Table 1. Cases were matched to controls by ethnicity, BMI and age and no significant differences were observed in these parameters. Smoking status was significantly different between cases and controls, as was gestational age at delivery (mean 39.19 (SD 2.84) vs. 40.41 (SD 0.94) weeks gestation; p = 0.007).
In addition, we determined whether there was a correlation between the lipids of interest and either smoking status, or gestational age at delivery. None of the lipids of interest correlated significantly with gestational age at delivery. However, one lipid, CL(72:2), was significantly associated with maternal smoking status (p-value = 5.82 × 10⁻⁴ with FDR = 0.06) and was excluded from our results. Box plots showing the levels of this lipid in smokers and non-smokers are shown in Supplementary File 2.

Global metabolomics analysis of urine samples
Clinical data and demographics of selected women are described in Table 1. Importantly, no significant correlations were found between the metabolites of interest and either smoking status, or gestational age at delivery, in either the Cork or Auckland populations.

Discussion and conclusions
This study has highlighted the following main findings: (i) altered glycerophospholipids and sphingolipids in plasma samples from women with SGA pregnancies, and (ii) lower levels of metabolites involved in nutrient transport and detoxification pathways in urine samples from women with SGA pregnancies. There were no common SGA-associated metabolite changes identified in plasma and urine.
Our lipidomics study shows evidence of altered glycerophospholipids (GPL) and sphingolipid pathways associated with SGA at 20 weeks' gestation in Cork plasma samples. Phospholipids are the main lipid class present in biological bilayers, such as cell membranes, and are involved in inflammation, apoptosis, and storage and breakdown of lipids for energy (Baig et al. 2013). Our findings are in agreement with a metabolomics study performed using samples from the SCOPE study taken at 15 weeks of gestation in Australian participants (Horgan et al. 2011). In the Horgan et al. study, metabolic profiles of maternal plasma, venous cord blood, and plasma samples from a rat model of SGA (reduced uterine perfusion pressure, RUPP, model) were obtained on a UPLC system coupled to a hybrid LTQ-Orbitrap mass spectrometer. Horgan et al. identified nineteen metabolites as statistically different in SGA cases as compared to controls; these included sphingolipids and phospholipids, such as several lysophosphatidylcholines and phosphatidylcholines, most of which were detected in higher levels in SGA groups in maternal plasma samples, thereby supporting our findings of increased plasma glycerophospholipids (GPL) and sphingolipids in SGA. In addition, a recent study combining the use of direct infusion MS/MS, LC-MS/MS, proton nuclear magnetic resonance (NMR) and artificial intelligence showed alteration of several pathways in cord blood samples, including phospholipid biosynthesis and fatty acid metabolism, to be associated with SGA (Bahado-Singh et al. 2019).
Fig. 2 Box plots showing the normalised intensity of features significantly altered (adjusted p-value < 0.05) in small for gestational age group (cases, orange box) compared to control group (blue box), from untargeted UPLC-MS analysis of SCOPE Cork plasma samples (cases n = 40, controls n = 40). Quality control (QCs) group also shown (grey box)
Other pregnancy complications have been associated with altered lipid levels when compared to uncomplicated pregnancy, including spontaneous preterm birth (sPTB) (Morillon et al. 2020), recurrent miscarriage and pre-eclampsia (Baig et al. 2013). An animal model of pregnancy loss, in which sphingosine kinase, an enzyme of the sphingolipid pathway, was inactivated, showed an increased rate of early pregnancy loss compared to wild type mice (Mizugishi et al. 2007). This further shows the critical role of lipids in the pathophysiology of pregnancy complications.
Our metabolomics study of urine samples taken at 20 weeks' gestation showed that the metabolites altered in SGA were D-Glucuronic acid in Auckland samples and Estriol-16-glucuronide in Cork samples, both of which are involved in detoxification, and Sulfolithocholic acid, which is implicated in nutrient absorption and transport. None of the significantly different metabolites of interest were found in both the Cork and Auckland populations; however, several are involved in similar pathways and cellular processes. This may be attributed to the fact that the two independent analyses were run months apart, on different BEH C18 columns, so chromatographic variability was not minimised. Future projects will take a more targeted approach to metabolomics validation given the inherent problems of chromatographic reproducibility.
Overall, the lipidomics and metabolomics analyses performed on samples taken from Cork women suggested placental insufficiency and inadequate transport of lipids to the placenta, resulting in impaired fetal growth (Zhang et al. 2015) detected as early as 20 weeks of gestation. Indeed, the placenta plays a key role in the development of the fetus in utero, as it ensures the fetus receives sufficient nutrients, especially oxygen, amino acids, glucose and fatty acids (Lager and Powell 2012).
Our study had limitations; one of them was the number of smoking women in the Cork population (30% in the SGA group, and 10% in the control group), and another was the significant difference in gestational age at delivery between the SGA group and controls in the Cork population. This latter difference could be attributed to the fact that the participants selected for the SGA group delivered babies with extremely low customised birthweight centile (median of 2.35, IQR 0.8-4.48) compared to the control group. Cases and controls were matched for age, ethnicity and BMI when the studies were designed, and exclusion of participants would have reduced the power of this study. In addition, using all known risk factors associated with SGA to adjust the data during statistical analysis would have led to overfitting the data. We did, however, use a stringent false discovery rate cut-off at 5% to select metabolites and lipids of interest, and we further tested the metabolites and lipids of interest to determine if they were significantly correlated with clinical factors (in addition to SGA). No significant correlation was observed between any metabolite of interest and gestational age at delivery or smoking status, and just one lipid of interest, CL(72:2), was found to be significantly correlated with smoking status. However, CL(72:2) was also significantly correlated with SGA, and further analysis to decipher the biological link between CL(72:2) and smoking in preclinical models is warranted to rule out this confounding result. Another limitation of our lipidomics study is that no lipid standards were used to confirm the identities; however, a library search using accurate mass and fragment ions was performed, thus achieving metabolite identification level 2 according to the MSI reporting standards (Sumner et al. 2007).
In conclusion, this study showed that higher levels of glycerophospholipids and sphingolipids at 20 weeks of gestation are associated with the onset of SGA in participants of the SCOPE study in Cork. However, whether the correlations represent a cause or an effect of SGA needs to be further investigated. Further studies are needed to validate these findings in an independent pregnancy cohort and to examine whether there may be the potential to use these lipids to predict pregnancies at risk of SGA.