Metabolic alterations in immune cells associate with progression to type 1 diabetes

Aims/hypothesis Previous metabolomics studies suggest that type 1 diabetes is preceded by specific metabolic disturbances. The aim of this study was to investigate whether distinct metabolic patterns occur in peripheral blood mononuclear cells (PBMCs) of children who later develop pancreatic beta cell autoimmunity or overt type 1 diabetes. Methods In a longitudinal cohort setting, PBMC metabolomic analysis was applied in children who (1) progressed to type 1 diabetes (PT1D, n = 34), (2) seroconverted to ≥1 islet autoantibody without progressing to type 1 diabetes (P1Ab, n = 27) or (3) remained autoantibody negative during follow-up (CTRL, n = 10). Results During the first year of life, levels of most lipids and polar metabolites were lower in the PT1D and P1Ab groups compared with the CTRL group. Pathway over-representation analysis suggested alanine, aspartate, glutamate, glycerophospholipid and sphingolipid metabolism were over-represented in PT1D. Genome-scale metabolic models of PBMCs during type 1 diabetes progression were developed by using publicly available transcriptomics data and constrained with metabolomics data from our study. Metabolic modelling confirmed altered ceramide pathways, known to play an important role in immune regulation, as specifically associated with type 1 diabetes progression. Conclusions/interpretation Our data suggest that systemic dysregulation of lipid metabolism, as observed in plasma, may impact the metabolism and function of immune cells during progression to overt type 1 diabetes. Data availability The GEMs for PBMCs have been submitted to BioModels (www.ebi.ac.uk/biomodels/), under accession number MODEL1905270001. The metabolomics datasets and the clinical metadata generated in this study were submitted to MetaboLights (https://www.ebi.ac.uk/metabolights/), under accession number MTBLS1015. Electronic supplementary material The online version of this article (10.1007/s00125-020-05107-6) contains peer-reviewed but unedited supplementary material, which is available to authorised users.

The ultra-high performance liquid chromatography-quadrupole time-of-flight mass spectrometry (UHPLC-Q-TOF-MS) analyses were done in a similar manner to that described earlier, with some modifications [3,4]. The UHPLC-Q-TOF-MS system was from Agilent Technologies (Santa Clara, CA, USA) combining a 1290 Infinity LC system and 6545 quadrupole timeof-flight mass spectrometer (Q-TOF-MS), interfaced with a dual jet stream electrospray (dual ESI) ion source. MassHunter B.06.01 software (Agilent Technologies, Santa Clara, CA, USA) was used for all data acquisition and MZmine 2 was used for data processing [5].
Chromatographic separation was performed using an Acquity UPLC® BEH C18 column (100 mm × 2.1 mm i.e., 1.7 µm particle size) and protected using a C18 precolumn, both from Waters Corporation (Wexford, Ireland). The injection volume was 1 µL. The mobile phases were water and acetonitrile:2-propanol (1:1, v/v), both containing 1% 1M ammonium acetate and 0.1% (v/v) formic acid as ionization agents. The LC pump was programmed at a flow rate of 0.4 mL min -1 and the elution gradient was as follows: from min 0-2, the percentage of phase organic phase was modified from 35% to 80%, from min 2-7, the percentage of organic phase was modified from 80-100% and then, the final percentage was held for 7 min. A post-time of 7 min was used to regain the initial conditions for the next analysis. Thus, the total analysis time per sample was 21 min. The column and the eluents were kept at 50 °C and the multisampler was kept at 10°C. The settings of the dual ESI ion source were as follows: capillary voltage 4.5 kV, nozzle voltage 1500 V, N 2 pressure in the nebulizer 21 psi, N 2 flow rate and temperature as sheath gas 11 L min -1 and 379 °C, respectively. Accurate mass spectra in the MS scan were acquired in the m/z range 100-1700 in positive ion mode, and in range 100-1700 in the negative mode for the MS/MS runs. In the ESI-mode, capillary voltage 3.6 kV, nozzle voltage 1500 V, N 2 pressure in the nebulizer 21 psi, N 2 flow rate and temperature as sheath gas 11 L min -1 and 379 °C, respectively. The injector system was washed with 10% DCM in MeOH and ACN:MeOH:IPA:H2O (1:1:1:1, v/v/v/v) + 0.1% HCOOH as needle wash solutions after each injection for 7.5s each.
Identification of lipids was carried out by combining MS (and retention time), MS/MS information, and a search of the LIPID MAPS spectral database (http://www.lipidmaps.org). MS/MS data were acquired in both negative and positive ion modes in order to maximize identification coverage. The confirmation of a lipid's structure requires the identification of hydrocarbon chains bound to its polar moieties, and this was possible in some cases. This identification was carried out in pooled cell extracts, and with this information, an in-house database was created with m/z and retention time for each lipid. This in-house database was used for processing data by MZmine 2 [5]. Glycoceramides were identified based on their accurate mass, MS/MS analysis and retention times. Further verification of majority of the ceramides was possible with authentic standards.

Analysis of polar metabolites
To the remaining 75 µL aliquot, 225 µL of ice-cold MeOH (LC-grade, Honeywell) containing the following internal standards (all from Sigma Aldrich): Heptadecanoic acid (5 ppm), DL-valine-d8 (1 ppm), and succinic acid-d4 (1 ppm) was added. The samples were then sonicated in an ice bath for 30 s prior to centrifugation (5500 g, 5 min). 250 µL of the supernatant was transferred to a 2 mL glass autosampler vial. The pellet was stored at -20 °C for protein analysis. The protein content was measured by the Bradford method. The supernatant was dried under a stream of nitrogen at 45 °C. Prior to the mass spectrometry measurements, the samples were derivatized using a two-step procedure. Initially the samples were methoximated by incubating the samples with methoxyamine hydrochloride (25 µL, 20mg/mL in pyridine, Sigma Aldrich) at 45 °C for 1 h. MSTFA (25 µL, Sigma Aldrich) was then added and the samples were incubated for a further 1 h. A retention index standard containing straight chain, even alkanes (n 10-40, 10 µL, Sigma Aldrich) was added. The derivatized samples were analyzed using gas chromatography (Agilent 7890B) coupled to a single quad mass spectrometer (5977B). The metabolites were separated using a 30 m × 0.25 mm (ID) with a film thickness of 0.25 µm HP-5 (Agilent). A guard column (10 m) with an ID of 0.25 mm was used. 1 µL of the sample was injected in splitless mode with an inert glass liner (Agilent) held at a temperature of 240 °C. The GC was set to constant flow mode (1.2 mL/min) using helium (Aga) as the carrier gas. The GC oven was programed as follows: 50 °C (isothermal for 0.2 min), then 7 °C/min until 240 °C, then 20 °C/min until 300 °C (isothermal for 5 min). The transfer line was held at 260 °C for the whole run. The ion source was set to electron ionization mode and held at 230 °C and the quadrupole at 150 °C. Due to the large number of analytes quantified, the samples were injected twice. The first run quantified the amino acids and the second run quantified all other components. The MSD was set up in select ion monitoring mode to maximize sensitivity. The ions monitored can be found in (ESM Table 2).

Data preprocessing
Lipidomics data processing was performed using open source software MZmine 2.33 [5]. The following steps were applied in the processing: 1) Crop filtering with a m/z range of 350 -1700 m/z and a RT range of 2.0 to 12 min, 2) Mass detection with a noise level of 1200, 3) Chromatogram builder with a min time span of 0.08 min, min height of 1000 and a m/z tolerance of 0.006 m/z or 10.0 ppm, 4) Chromatogram deconvolution using the local minimum search algorithm with a 70% chromatographic threshold, 0.05 min minimum RT range, 5% minimum relative height, 1200 minimum absolute height, a minimum ratio of peak top/edge of 1 and a peak duration range of 0.08 -5.0, 5) Isotopic peak grouper with a m/z tolerance of 5.0 ppm, RT tolerance of 0.05 min, maximum charge of 2 and with the most intense isotope set as the representative isotope, 6) Join aligner with a m/z tolerance of 0.008 or 10.0 ppm and a weight for of 2, a RT tolerance of 0.1 min and a weight of 1 and with no requirement of charge state or ID and no comparison of isotope pattern, 7) Peak list row filter with a minimum of 12 peaks in a row (= 10% of the samples), 8) Gap filling using the same RT and m/z range gap filler algorithm with an m/z tolerance of 0.006 m/z or 10.0 ppm, 9) Identification of lipids using a custom database search with an m/z tolerance of 0.006 m/z or 10.0 ppm and a RT tolerance of 0.1 min, 10) Normalization using lipid-class-specific internal standards and (semi) quantitation with lipid-class-specific calibration curves, 11) Normalization with total protein amount 12) Data imputation of missing values were done with half of the row's minimum.
The GC-QMS data was processed in MassHunter Quant (v8, Agilent technologies) The peaks were manually checked and corrected if needed for correct integration. Quantification was performed using the ion listed (ESM Table 2). Standard curves were used to quantify each metabolite using the assigned internal standards. Metabolites which had a CV greater than 30% in the pooled QC sample or fell below the limit of quantification were excluded from subsequent analysis.

Statistical methods
Homogeneity of the samples were assessed by principal component analysis (PCA) which was performed using 'prcomp' function included in the 'stats' package of R. The scores of the observation falling outside the 95% confidence interval were considered as outlier.
The effect of different factors such as age, gender, conditions and their interactions on the lipidomics dataset was evaluated. The data was centered to zero mean and unit variance. The relative contribution of each factor to the total variance in the dataset was estimated by fitting a linear regression model, where the normalized intensities of metabolites were regressed to the factor of interest, and thereby median marginal coefficients (R 2 ) were estimated (ESM Fig. 4). This analysis was performed using 'scater' package.
Sparse Partial least squares Discriminant Analysis (sPLS-DA) [6] models comparing P1Ab vs. CTRL, PT1D vs. CTRL, and PT1D vs. P1Ab groups, paired at 12, 24 and 36 months were developed and Variable Importance in Projection (VIP) scores [7] of the features/metabolites were estimated. sPLS-DA modeling was performed using the 'splsda' function coded in the 'mixOmics v6.3.2' package. sPLS-DA models were cross-validated [8] by 7-fold cross-validation and models diagnostics were generated using 'perf' function. The multivariate analysis was followed by univariate; Two-Sample t-testing using the 't.test' function was applied to compare the mean differences in the metabolite intensities between P1Ab vs. CTRL, PT1D vs. CTRL, and PT1D vs. P1Ab groups. Together, multi-and univariate analysis was used for selection of metabolites altered between these subgroups at a particular age. All metabolites that passed one or more criteria for variable selection, i.e., with sPLS-DA model area under the ROC curve (AUC) >= 0.65; RC (>± 0.05), VIP scores > 1 and/or (T-test; p-value < 0.05) were listed as altered. Multiple testing was performed and the nominal p-values (T-test) were subjected to FDR correction. The features/metabolites did not pass the threshold (FDR corrected p-values < 0.05) which might be due to small sample size of these subgroups, analyzed at a particular time-point.
Spearman correlation was applied to identify association between plasma and cellular metabolite levels in CTRL, P1Ab and PT1D groups. Spearman's correlation coefficient (r) was calculated using the 'rcorr' function implemented in the 'Hmisc' package. P-values were subjected to False Discovery Rates (FDR) adjustment using 'p-adjust'. Loess regression was used for the interpolation of the metabolite intensities along 12, 24 and 36 months of age. It was performed using 'loess' function deployed in the 'stats' package.
Pathway overrepresentation analysis (POA) was performed using the MetaboAnalyst 4.0 web platform [9] using the 'Pathway Analysis' module. Those metabolites altered between different subgroups were listed and mapped to the human metabolic network as a background; 'relative-betweenness Centrality' was selected for 'pathway topology analysis', and global hypergeometric test (GHT) was performed. GHT estimated the relative significance of the overrepresented pathways against the background KEGG pathways [10] for Homo sapiens. The metabolic subsystems/pathways with (FDR < 0.05) is reported in (Fig. 4). The Pathway Impact Scores (PIS) were estimated by the metabolomics pathway analysis (MetPA) tool [11] encoded in MetaboAnalyst 4.0 [9].

Meta-analysis of transcriptomics datasets and genome-scale metabolic modeling
Genome-scale metabolic modelling (GSMM) is a constraint-based mathematical modelling approach that integrates biochemical, genetic and genomic informations within a computational framework [12][13][14][15]. It is used to study metabolic genotype-phenotype relationship of an organism. They are efficient tools for prediction of growth in living cells/tissues exposed to different nutrients [16,17]. The structure of genome-scale model (GEM) provides scaffolds for integration of different types of omics data such as transcriptome, proteome and metabolome/fluxome [18]. Several algorithms were designed that allow integration and contextualization of GEMs (i.e. to constrained the metabolic reaction bounds with the experimental data or conditions), based on expression datasets [19].
The feasibility of a particular metabolic reaction(s) to be included or discarded in the draft GEM model was evaluated. The polar metabolites and the lipid intensities from this study were used to estimate the confidence score of a metabolite in a reaction [20], that is either included or discarded in the draft model. By applying this strategy, condition-specific PBMC models for PT1D, P1Ab and CTRL were developed. Quality control (QC) or sanity checks were performed on the draft models [21][22][23]. In addition, the models were tested for their ability to carry out basic metabolic tasks [24,25]. The blocked reactions were removed before simulations.  Effects of different factors such as age, gender and study group status on lipidomics data were evaluated. The data were centered to zero mean and unit variance. The relative contribution of each factor (experimental variable) to the total variance in the dataset was estimated by fitting the linear model regression model, where the normalized intensities of metabolites regressed to the factor of interest, and thereby estimating the median marginal coefficients (R 2 ). This analysis was performed using package 'scater'. ESM Fig.6 Log intensities of the total lipids before (light blue) and after (light gray) the seroconversion (SC), in P1Ab and PT1D groups. Down-regulation in the intensities of the total lipids in the PBMCs were observed after seroconversion (SC), in P1Ab (median age of SC, 24 months) (p=5.581e-05) and PT1D (median age of SC, 14 months) (p=9.803-06).
ESM Fig.7 Selected profiles of polar metabolites that were altered in Ctrl, P1Ab, and PT1D groups, during the 36-months follow-up. (A-F) The log mean intensities of palmitic acid, aspartic acid, citric acid, myristic acid, phenylalanine, and serine are shown along the age (months). Loess regression was used for the interpolation of the data points.

Figure.S10
ESM Fig.10 Genome-scale metabolic models for PBMCs. The total numbers of genes, metabolites and reactions included in the PBMC models for Ctrl, P1Ab and PT1D, developed in the study is shown.
ESM Fig.13 Flux enrichments in the metabolic subsystems. Stacked bar plots showing cumulative (-log 10 ) q-values of the enriched subsystems. Enrichments of metabolic subsystems in PBMCs models for T1D progressors and nonprogressors when, (A) production of glucosylceramide is maximized, (B) production of digalactosylceramide from D-galactosyl-N-acylsphingosine is maximized.