Background

Metabolism is a complex, interconnected and finely regulated network. It is composed of biochemical reactions that transform endogenous or exogenous substrates into products vital for cell, tissue and organism function. As a result, deregulation of this homeostasis underlies the pathophysiological mechanisms of different diseases [1]. An alteration of a metabolic pathway may be related to nutritional, environmental, or genetic factors. Inborn errors of metabolism (IEM) are rare diseases mainly due to a genetic defect in enzymes or cofactors involved in a metabolic pathway or in the transport of metabolites within or between cells. For better management of IEM patients, rapid and accurate biochemical and molecular tests are needed. Omics approaches are appealing because they can speed up the molecular understanding of these diseases, may lead to more efficient biomarkers, and offer a holistic, systems-level view of disease [1]. The metabolome refers to all metabolites present in a given biological system [2]. Metabolomics is an “omics” technology that allows metabolome characterization [3, 4]. Metabolomics is particularly interesting for exploring IEM given their intrinsic link with metabolism [5]. Lysosomal storage diseases (LSD) represent a group of about 50 inherited disorders related to deficient lysosomal proteins. This impairment leads to a progressive accumulation of metabolites or macromolecules within the lysosomes. This storage causes, at least partly, various organ failures [6]. Mucopolysaccharidoses (MPS) are a subgroup of LSD. They are related to impaired catabolism of glycosaminoglycans (GAGs), namely chondroitin sulfate (CS), dermatan sulfate (DS), heparan sulfate (HS), keratan sulfate (KS), and hyaluronan, leading to GAG accumulation in the lysosomes and extracellular matrix [7]. This accumulation leads to multiple progressive tissue and organ failures [8]. Seven distinct forms of MPS are described and related to 11 known enzyme deficiencies [6].
Overall incidence is more than 1 in 30,000 live births [9]. Most MPS patients are asymptomatic after birth, but prenatal symptoms may be observed in MPS I, MPS IVA, MPS VI and more frequently in MPS VII. Symptoms and severity vary with the patient and the MPS subtype. Different MPS treatments are either in clinical use or under clinical trials [10]. Mucopolysaccharidosis type III (MPS III), or Sanfilippo syndrome, is caused by a congenital deficiency of one of the four enzymes involved in the degradation of HS [11]. The four subtypes, MPS IIIA, MPS IIIB, MPS IIIC and MPS IIID, all show autosomal recessive inheritance [12]. Typically, patients with Sanfilippo disease present with no obvious clinical features prior to 1–3 years of age. Growth parameters may be higher than the reference range in the first years of life, while a growth delay may be observed in older patients. In all MPS III subtypes, central nervous system (CNS) involvement predominates (neurodegeneration, hyperactivity and behavioral disturbances) with less pronounced skeletal abnormalities and organomegaly. In addition, the clinical picture includes hirsutism, coarse facial features, cardiomegaly, thick hair, cloudy cornea, recurrent diarrhea, otitis and dysarthria [12,13,14,15]. MPS IIIA is the most severe type with an earlier onset and a rapid neurological deterioration. The first signs occur at around 1–3 years of age and the clinical symptoms worsen gradually and inevitably, resulting in the onset of severe dementia and a complete loss of motor functions. As in other inherited metabolic diseases, the symptoms show high variability among patients, even within the same family. Patients usually die before the third decade of life, although patients with a mild phenotype and allelic heterogeneity have been reported [12,13,14,15]. MPS IIIA (OMIM #252900) is caused by heparan-N-sulfatase (SGSH, EC 3.10.1.1) deficiency with an incidence of 1 in 100,000 [16, 17].
MPS IIIB (OMIM #252920) is due to N-acetyl-α-glucosaminidase (NAGLU, EC 3.2.1.50) deficiency with an incidence of 1 in 200,000 [18]. MPS IIIC (OMIM #252930) is caused by heparan acetyl-CoA:α-glucosaminide N-acetyltransferase (HGSNAT, EC 2.3.1.3) deficiency with an incidence of 1 in 1,500,000 [19]. MPS IIID (OMIM #252940) is due to N-acetylglucosamine-6-sulfatase (GNS, EC 3.1.6.14) deficiency with an incidence of 1 in 1,000,000 [20]. So far, no specific approved treatment is available. Gene therapy [21], bone marrow transplant [22], chaperone molecules [23], substrate deprivation therapy [24] and intrathecal enzyme therapy [25] are among the most active therapeutic research areas. The goal of this work is to apply both targeted and untargeted metabolomics to urine from MPS IIIA, MPS IIIB, MPS IIIC and MPS IIID patients, compared to controls, to investigate metabolic changes in these conditions.

Methods

Urine samples

Random urine samples were collected from patients with a confirmed MPS diagnosis in five expert centers for inherited metabolic diseases in France. The 49 untreated MPS III patients were distributed as follows: 13 MPS IIIA patients: 6 males (age range from 5.1 to 12.0 years, mean age 6.2 years) and 7 females (age range from 1.9 to 18.4 years, mean age 6.8 years); 16 MPS IIIB patients: 7 males (age range from 3.8 to 9.8 years, mean age 7.2 years) and 9 females (age range from 2.9 to 11.7 years, mean age 6.3 years); 13 MPS IIIC patients: 7 males (age range from 6.4 to 20.6 years, mean age 12.1 years) and 6 females (age range from 2.8 to 31.1 years, mean age 10.0 years); 7 MPS IIID patients: 3 males (age range from 3.8 to 17.5 years, mean age 8.9 years) and 4 females (age range from 3.4 to 18.7 years, mean age 7.8 years). Control urine samples were collected from 66 healthy subjects, 27 males and 39 females (age range from 5.5 to 70 years, mean age 40.8 years). This project was approved by the Research Ethics Board of Rouen University Hospital (CERNI E2016-21).

Metabolic phenotyping

The protocol used in this study has been previously described [26]. Briefly, urine samples were processed by transferring 200 µL of urine to 1.5 mL tubes and centrifuging at 4 °C for 10 min at 13,000 × g; then 100 µL of ultrapure water was added to 100 µL of supernatant and mixed. For untargeted metabolomics data acquisition, ultra-performance liquid chromatography–ion mobility mass spectrometry was performed on a Synapt G2 HDMS mass spectrometer (Waters, Saint-Quentin-en-Yvelines, France) as previously described [27]. For the targeted analysis, free amino acid profiling in urine was based on liquid chromatography coupled to tandem mass spectrometry. Detailed protocols are presented in Additional file 1.

Data analysis

A one-way analysis of variance (ANOVA) was applied for multiple-group testing, while a t test was used for binary comparisons. The Benjamini–Hochberg false discovery rate (FDR) method was used for multiple testing correction with an FDR cut-off of 5%. Receiver operating characteristic (ROC) curves were used to assess the diagnostic performance of the chosen classifiers. Support vector regression normalization was applied to the untargeted metabolomics data [28]. The normalized data were log-transformed and Pareto-scaled. All data modeling and analysis were done using SIMCA 15.0 (MKS DAS, Umeå, Sweden) and R software. The Mummichog algorithm was used for pathway analysis [29], while MetaboAnalyst was used for Metabolite Set Enrichment Analysis on the amino acid data [30]. Details regarding data modeling and validation results from all models are provided in Additional file 1. Figure 1 presents an overview of the implemented metabolomics workflow.
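As an illustration of two of the steps named above, the following minimal Python sketch shows log-transformation followed by Pareto scaling of a single feature, and Benjamini–Hochberg adjustment of a list of p-values. It is not the code used in this study (the analysis was run in SIMCA and R); the function names, the choice of the natural logarithm, and the example p-values are assumptions made only for illustration.

```python
import math

def log_pareto_scale(column):
    """Log-transform one feature (positive intensities), then Pareto-scale it.

    Assumes the natural logarithm and a non-constant feature (SD > 0).
    """
    logged = [math.log(x) for x in column]
    n = len(logged)
    mean = sum(logged) / n
    # Sample standard deviation (n - 1 in the denominator).
    sd = math.sqrt(sum((v - mean) ** 2 for v in logged) / (n - 1))
    # Pareto scaling: mean-center, then divide by the square root of the SD.
    return [(v - mean) / math.sqrt(sd) for v in logged]

def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical raw p-values for five features.
p_raw = [0.001, 0.008, 0.039, 0.041, 0.60]
p_adj = benjamini_hochberg(p_raw)
# A feature is retained when its adjusted p-value is at or below the 5% FDR level.
significant = [p <= 0.05 for p in p_adj]
```

In this toy example, two features with raw p-values below 0.05 no longer pass after adjustment, which is the behavior the FDR correction is meant to provide.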

Fig. 1

Illustration of the experimental workflow spanning from experimental design and data acquisition to pathway analysis and biological interpretation

Results

Untargeted analysis

The heatmap in Fig. 2a depicts the top 100 features ranked by ANOVA (p < 0.05 cut-off and FDR 5%). The results highlight correct clustering of the different sample groups, and the dendrogram structure, using Euclidean distance, shows two main clusters of variables. We applied principal component analysis (PCA) to further analyze the underlying differential metabolic profiles. A three-component PCA model accounting for 18% of the total variance was built. Trends, groups and potential outliers within the data were investigated using score plots. For predictive classification purposes, supervised methods were used since they allow modeling of the relationship between control, MPS IIIA, IIIB, IIIC and IIID samples. First, an OPLS-DA classification was applied to the whole dataset. Samples were labeled according to the corresponding groups, MPS IIIA, IIIB, IIIC, IIID and control (Fig. 2b, c). A negative intercept of the Q2 regression line in the permutation test supports the validity of the OPLS-DA models. The final model had R2 = 0.77 and Q2 = 0.13. The OPLS-DA scores plot (Fig. 2b) shows a clear separation, suggesting that the OPLS-DA model successfully classified samples according to their respective metabolic profiles. Model validation was assessed both by CV-ANOVA (p-value = 3 × 10⁻²) and by the permutation test (999 permutations gave a negative Q2 intercept). Details regarding model validation are shown in Additional file 1: Fig. S3. Furthermore, separate binary OPLS-DA classification models were built for each disease group vs control. For control and MPS IIIA samples, the model had one predictive and two orthogonal components, and its validation parameters were as follows: R2 = 0.89, Q2 = 0.23 and CV-ANOVA p-value = 2.84 × 10⁻³ (Additional file 1: Fig. S4). The corresponding score plot is shown in Fig. 2d. It shows a clear separation between the two classes on the predictive component.
For MPS IIIB and control samples, the model had one predictive and two orthogonal components with R2 = 0.89, Q2 = 0.21 and CV-ANOVA p-value = 5.28 × 10⁻³ (Fig. 2e). For MPS IIIC and control samples, the model had one predictive and three orthogonal components with R2 = 0.98, Q2 = 0.39 and CV-ANOVA p-value = 1.35 × 10⁻⁵ (Fig. 2f). Another OPLS-DA model was built for MPS IIID and control samples, with one predictive and two orthogonal components, R2 = 0.95, Q2 = 0.36 and CV-ANOVA p-value = 1.83 × 10⁻⁵ (Fig. 2g). Discriminant variables were selected using their variable importance in projection (VIP) scores in each validated OPLS-DA model. Using a VIP cutoff of 1, 25 of 854 features were selected for the MPS IIIA vs control model, 243 for MPS IIIB vs control, 247 for MPS IIIC vs control and 262 for MPS IIID vs control. The variable lists were refined by retaining only the most discriminant variables and their putative annotations. The list included N-acetylserotonin, N-succinyl-L,L-2,6-diaminopimelate, octanoylglucuronide and 3-(2-hydroxyphenyl)propanoic acid. These discriminant variables are listed in Tables 1 and 2 with their respective statistical metrics and annotations. Boxplots of the main discriminant features are presented in Additional file 1: Fig. S7. The discriminant performance of these features was also investigated using the area under the ROC curve (AUC). N-Acetylserotonin had the highest AUC for MPS IIIA (AUC = 0.83) and MPS IIIB (AUC = 0.83). N-Succinyl-L,L-2,6-diaminopimelate had the highest AUC (0.73) for MPS IIIC, and octanoylglucuronide performed best for MPS IIID with an AUC of 0.79. The results are shown in Tables 1 and 2. Furthermore, the underlying impaired pathways in each disease were explored using Mummichog. The results are shown in Table 3. Interestingly, amino acid metabolism and fatty acid pathways were markedly dysregulated.
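The AUC values reported above summarize how well a single feature separates patients from controls. A minimal Python sketch of this computation is given below, using the pairwise (Mann–Whitney) formulation of the AUC. The function name and the intensity values are hypothetical, and the sketch assumes that higher values indicate disease; it is not the implementation used to produce Tables 1 and 2.

```python
def roc_auc(case_values, control_values):
    """Empirical AUC for one feature.

    Equals the fraction of (case, control) pairs in which the case value
    is higher, with ties counted as half a pair.
    """
    wins = 0.0
    for case in case_values:
        for control in control_values:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_values) * len(control_values))

# Hypothetical normalized intensities of one feature.
cases = [2.1, 3.4, 2.9, 4.0]
controls = [1.0, 2.5, 1.8, 2.2, 3.0]
auc = roc_auc(cases, controls)
```

For a feature that is decreased in patients, this function returns a value below 0.5; its discriminant performance is then read as 1 minus that value.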

Fig. 2

a Hierarchical cluster analysis and heat map visualization of the top 100 variables (x-axis) ranked by ANOVA. The urine sample classes are represented along the y-axis. The color code represents log-scaled intensities of features between −5 (blue) and +5 (brown), showing the relative abundance of the features according to the groups. b OPLS-DA scores plot (R2 = 0.77, Q2 = 0.13) shows a clear separation between the different diseased and control groups (MPS IIIA, MPS IIIB, MPS IIIC, MPS IIID and control). c OPLS-DA scores plot (R2 = 0.93, Q2 = 0.05) shows a clear separation between the different diseased groups (MPS IIIA, MPS IIIB, MPS IIIC and MPS IIID). d Clear separation between MPS IIIA and control samples is observed (R2 = 0.89, Q2 = 0.23). e Clear separation of MPS IIIB samples from the controls is observed (R2 = 0.89, Q2 = 0.21). f Clear separation of MPS IIIC samples from the controls is observed (R2 = 0.98, Q2 = 0.39). g Clear separation of MPS IIID samples from the controls is observed (R2 = 0.95, Q2 = 0.36). Detailed model characteristics and validation are given in Additional file 1

Table 1 Some discriminant features, putatively annotated, extracted by the different OPLS-DA models for MPSIIIA, MPSIIIB, MPSIIIC and MPSIIID
Table 2 Statistical and discriminant metrics of the selected annotated features
Table 3 Significantly dysregulated pathways

Targeted analysis

We also quantified twenty-four amino acids; Additional file 1: Table S4 presents their absolute urine concentrations. Boxplots of normalized amino acid concentrations are shown in Additional file 1: Fig. S8 and the statistical metrics are presented in Table 4. MPS IIIA yielded 11 significantly changed amino acids compared to controls: arginine, aspartic acid, alanine, threonine, histidine, phenylalanine, glycine, proline, asparagine and tyrosine. For MPS IIIB vs control, arginine, aspartic acid, alanine, threonine, histidine, phenylalanine, glycine, proline, glutamine, asparagine, tyrosine and leucine showed significant differences. For MPS IIIC vs control, 6 amino acids showed differences: arginine, aspartic acid, serine, isoleucine, methionine and citrulline. For MPS IIID vs control, 6 amino acids showed differences: arginine, alanine, threonine, glycine, glutamine and citrulline. To determine the overall amino acid profile differences between controls and each of the MPS III subtypes, the amino acid concentrations were assessed using an ANOVA test. The analysis yielded 17 amino acids that passed the p < 0.05 cut-off (FDR 5%). A hierarchical clustering analysis was applied to group samples according to their profile similarities. The heatmap in Fig. 3a represents the 24 amino acids ranked by ANOVA. The results show that all samples belonging to the same group were correctly clustered. The dendrogram structure, using Euclidean distance, highlights two main clusters of variables. Furthermore, a correlation analysis was performed. Figure 3b–e presents the correlation heatmaps for MPS IIIA, MPS IIIB, MPS IIIC and MPS IIID, respectively. Each panel shows a clear cluster of highly correlated variables.
Figure 3b (MPS IIIA vs control) shows a main cluster including alanine, leucine, valine, glycine, tyrosine, threonine, isoleucine, histidine, lysine, tryptophan, serine, asparagine, glutamine, phenylalanine, cystine and methionine. Regarding MPS IIIB vs control, Fig. 3c shows two main clusters: the main one includes methionine, isoleucine, serine, cystine, lysine, histidine, asparagine, glutamine, threonine, tyrosine, glycine, alanine, leucine, valine, phenylalanine, and tryptophan. Regarding MPS IIIC vs control, Fig. 3d shows three clusters: the main one includes cystine, lysine, histidine, glycine, alanine, methionine, isoleucine, serine, glutamine, tryptophan, tyrosine, leucine, valine, asparagine, phenylalanine and threonine. For MPS IIID vs control, Fig. 3e shows two clusters: the main one includes methionine, isoleucine, serine, cystine, lysine, glycine, histidine, tryptophan, leucine, valine, tyrosine, phenylalanine, asparagine, glutamine and threonine. To assess the diagnostic performance of the different amino acids, we performed univariate ROC curve analyses for each MPS III subtype compared to controls. For MPS IIIA, four amino acids had an AUC above 0.80: arginine (0.98), aspartic acid (0.95), alanine (0.85) and threonine (0.81). The same procedure for MPS IIIB vs control indicated seven amino acids with an AUC above 0.80: arginine (0.98), aspartic acid (0.94), alanine (0.87), glutamine (0.87), threonine (0.86), asparagine (0.83) and histidine (0.81). For MPS IIIC, only arginine showed a high AUC (0.95). Regarding MPS IIID, three amino acids showed a high AUC: arginine (0.98), alanine (0.81) and glycine (0.81). The overall univariate and ROC analysis results are shown in Table 4 and Fig. 4.
ROC curves for different combinations of the main significant amino acids were also compared using PLS-DA models with three components each. The results are presented in Additional file 1: Fig. S9. Pathway analysis identified the main impaired metabolic pathways. For the MPS IIIA vs control and MPS IIIB vs control analyses, beta-alanine metabolism, malate–aspartate shuttle, arginine–proline metabolism, urea cycle and aspartate metabolism were among the most affected pathways. For the MPS IIIC vs control analysis, methionine metabolism, in addition to the abovementioned metabolic pathways, was the most affected. For the MPS IIID vs control analysis, urea cycle, arginine–proline metabolism, porphyrin metabolism and pyrimidine and purine metabolism were the most affected pathways. The overall results are shown in Fig. 5a–d for all the studied groups.
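The correlation heatmaps in Fig. 3b–e are based on Spearman rank-order correlation between pairs of amino acids. The following Python sketch shows how one such coefficient can be computed: values are converted to ranks (ties receive their average rank) and the Pearson correlation of the ranks is taken. Function names and concentrations are hypothetical, and the sketch assumes neither variable is constant; it is given for illustration only.

```python
import math

def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value is tied with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of positions pos+1 .. end+1
        for j in range(pos, end + 1):
            ranks[order[j]] = shared
        pos = end + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def spearman(x, y):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical urine concentrations of two amino acids in five samples.
alanine = [12.0, 15.5, 9.8, 20.1, 17.3]
leucine = [3.1, 3.9, 2.2, 5.6, 4.0]
rho = spearman(alanine, leucine)
```

Applying `spearman` to every pair of the 24 amino acids would give the symmetric matrix that a correlation heatmap displays.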

Table 4 Fold change, t-test statistics, and area under the curve (AUC) of the receiver operating curves (ROC) for 24 amino acids, free carnitine and acylcarnitines (p < 0.05)
Fig. 3

a Heat map representing the clustering of 24 amino acids across the five groups of samples (MPS IIIA, MPS IIIB, MPS IIIC, MPS IIID and controls). Columns represent individual samples and rows refer to amino acids. Shades of green or red represent elevation or decrease, respectively, of an amino acid. b–e Spearman rank-order correlation matrix of the 24 amino acids based on their concentration profiles across all samples in MPS IIIA, MPS IIIB, MPS IIIC and MPS IIID, respectively. Shades of green to red represent low-to-high correlation coefficients between markers

Fig. 4

Circular plot of the 24 amino acids and their related −log (p) values in the different studied MPS III groups. Segments are color-coded according to amino acids and ribbon size represents −log (p) values (large ribbons mean low p-values). Corresponding p-values are presented in Table 4

Fig. 5

Metabolite Set Enrichment Analysis using amino acid concentrations. a MPS IIIA vs control. b MPS IIIB vs control. c MPS IIIC vs control. d MPS IIID vs control. e Venn diagram of the significant pathways retrieved from the experimental metabolomics data and from the in silico systems biology approach of Salazar et al. [31]. The diagram shows two common pathways: arginine–proline metabolism and urea cycle. Detailed pathway information is given in Additional file 1: Table S6

Discussion

In this study, urinary metabolite patterns in MPS III have been studied to unveil the biochemical indicators that may differentiate MPS III patients from control individuals. Of note, the mean-age difference between the studied groups represents a limitation, which is mainly due to pediatric recruitment difficulties and ethical considerations. However, the stringency of the statistical cut-off and the applied multiple testing correction might mitigate some of these biases. Using untargeted metabolomics, we succeeded in building a predictive model with a clear separation between the different studied groups (MPS IIIA, MPS IIIB, MPS IIIC, MPS IIID and control samples), which is underlined by the metabolic pattern similarity within each group. The retrieved data revealed a profound metabolic remodeling, mainly of amino acid-related metabolism. In light of these results, targeted amino acid analysis was performed, which confirmed the deep metabolic alterations. Using these data, pathway analysis succeeded in identifying the main disrupted pathways. Salazar et al. [31] reported an approach based on genome-scale human metabolic reconstruction to understand the effect of metabolic alterations in MPS. This in silico approach was applied to the MPS III subtypes (MPS IIIA, MPS IIIB, MPS IIIC and MPS IIID) by silencing the SGSH, NAGLU, HGSNAT and GNS genes, respectively, and the resulting models were analyzed through flux balance and flux variability analysis. We performed a comparative analysis between these in silico data and the pathway analysis results of the present study. This comparison is illustrated by a Venn diagram (Fig. 5e) and showed two main common pathways: arginine–proline metabolism and urea cycle. Detailed data are presented in Additional file 1: Table S4. Arginine–proline metabolism and its connections to the urea cycle are depicted in Additional file 1: Fig. S10.

As observed in MPS I patients [26], arginine metabolism is the most altered pathway. Aspartic acid is highly elevated in MPS IIIA and IIIB, significantly elevated in IIIC and shows a rising tendency in IIID. These pathways (arginine–proline metabolism, urea cycle, aspartic acid metabolism) have been reported to be upregulated, along with high autophagic activity, upon oxygen and glucose reduction in cultured fibroblasts [32], which is consistent with their involvement in bioenergetic balance. As in other LSDs, arginine metabolism may be challenged in MPS III due to lysosome dysfunction and its subsequent autophagic block [33]. Aspartic acid contributes to the synthesis of N-acetyl-L-aspartate (NAA) and its derivative N-acetylaspartylglutamate (NAAG). NAA plays a central role in neuronal osmoregulation and myelin synthesis, whereas NAAG is a key neurotransmitter. NAA and NAAG are highly abundant in the brain; their synthesis and catabolism take place in the brain and are highly regulated and compartmentalized [34]. This tight homeostasis is consistent with a key function of these compounds in the central nervous system. The importance of NAA metabolism is illustrated by the brain damage associated with deficiency of the NAA catabolic enzyme aspartoacylase in Canavan disease, an early-onset spongiform leukodystrophy [35]. It has also been reported that the NAA signal obtained using magnetic resonance spectroscopy is reduced in metachromatic leukodystrophy, Krabbe disease and other lysosomal storage diseases [36, 37].

A recent study reported metabolomics profiling in serum from MPS IIIA and MPS IIIB patients. Our results are in accordance with this study, which showed notable disturbance of key amino acids, indicating profound metabolic pathway remodeling. Interestingly, NAA levels were decreased in these patients compared to controls [38].

Conclusion

In this study, urine global metabolomics profiling revealed profound metabolic impairments in patients with MPS III. The identification of pathological metabolomics signatures may provide better understanding of the pathophysiological mechanism underlying these diseases and thus allow therapeutic innovation in such rare conditions.