Metachromatic leukodystrophy genotypes in The Netherlands reveal novel pathogenic ARSA variants in non-Caucasian patients

Metachromatic leukodystrophy (MLD) is an autosomal recessively inherited sulfatide storage disease caused by deficient activity of the lysosomal enzyme arylsulfatase A (ASA). Genetic analysis of the ARSA gene is important in MLD diagnosis and screening of family members. In addition, more information on genotype prevalence will help interpreting MLD population differences between countries. In this study, we identified 31 different ARSA variants in the patient cohort (n = 67) of the Dutch expertise center for MLD. The most frequently found variant, c.1283C > T, p.(Pro428Leu), was present in 43 (64%) patients and resulted in a high prevalence of the juvenile MLD type (58%) in The Netherlands. Furthermore, we observed in five out of six patients with a non-Caucasian ethnic background previously unreported pathogenic ARSA variants. In total, we report ten novel variants including four missense, two nonsense, and two frameshift variants and one in-frame indel, which were all predicted to be disease causing in silico. In addition, one silent variant was found, c.1200C > T, that most likely resulted in erroneous exonic splicing, including partial skipping of exon 7. The c.1200C > T variant was inherited in cis with the pseudodeficiency allele c.1055A > G, p.(Asn352Ser) + ∗96A > G. With this study we provide a genetic base of the unique MLD phenotype distribution in The Netherlands. In addition, our study demonstrated the importance of genetic analysis in MLD diagnosis and the increased likelihood of unreported, pathogenic ARSA variants in patients with non-Caucasian ethnic backgrounds. Electronic supplementary material The online version of this article (10.1007/s10048-020-00621-6) contains supplementary material, which is available to authorized users.


Introduction
Metachromatic leukodystrophy (MLD, OMIM #250100) is an autosomal recessively inherited sulfatide storage disease caused by deficient activity of the lysosomal enzyme arylsulfatase A (ASA). The disease is characterized by progressive central and peripheral demyelination, resulting in severe neurological deterioration. The most prominent signs and symptoms are ataxia, spasticity, cognitive decline, behavioral disturbances, peripheral neuropathy, and eventually a severely disabled state with epilepsy, painful spasticity, loss of motor and communication skills, and premature death [1]. Based on the age of symptom onset, three main MLD phenotypes can be distinguished: late-infantile (< 30 months), juvenile (2.5-16 years), and adult (> 16 years) MLD, often with an extra distinction between early-juvenile (2.5-6 years) and late-juvenile (6-16 years) MLD patients. A severe phenotype, with symptom onset at a younger age, faster disease progression, and shorter life expectancy, is usually accompanied by lower levels of residual ASA activity. However, a close correlation between disease severity and residual ASA activity could not be established [2][3][4].
André B.P. van Kuilenburg and Nicole I. Wolf are co-senior authors.
Residual ASA activity levels are (partially) dependent on the two functional types of pathogenic ARSA variants that could be present: those resulting in inactive ASA (0-alleles, r.0), including the common splice donor site variants c.465 + 1G > A (r.0) and c.1210 + 1G > A (r.0), and those resulting in some residual ASA activity (R-alleles), including the frequently found missense variants c.1283C > T, p.(Pro428Leu) and c.542T > G, p.(Ile181Ser) [5]. Carriers of one pathogenic ARSA variant typically have reduced ASA activity, although far above the ASA activity range of MLD patients [4,6]. Although determination of ASA activity, in combination with the clinical symptoms and genetic findings, is very useful for diagnosing MLD, it cannot distinguish the different MLD phenotypes. ASA activity levels (1) show considerable variability between patients with the same clinical phenotype, even within families [7,8]; (2) can vary within individual patients at repeated testing due to inter-assay variability [7]; and (3) can be reduced to within the range of MLD patients in healthy individuals carrying two copies of the pseudodeficiency (Pd) allele c.1055A > G, p.(Asn352Ser) + *96A > G [9]. Due to the high frequency of the Pd allele, MLD patients may also have one or two copies of this allele in addition to their pathogenic ARSA variants. Some ARSA variants have been found to be inherited in cis with the Pd allele [6]. Therefore, accurate genetic analysis of the ARSA gene is necessary in MLD diagnosis, especially when screening of (pre-symptomatic) family members is indicated [4].
In this study, we report the prevalence of pathogenic ARSA variants and MLD phenotypes in the patient cohort of the Amsterdam Leukodystrophy Center, a Dutch nationwide expertise center.

Patient data
In this retrospective study, approved by the institutional review board and with appropriate consent of patients/their guardians, we included 76 patients who were referred to the Amsterdam University Medical Center (Amsterdam UMC) with a confirmed diagnosis of MLD. We considered MLD to be confirmed when at least two different tests were compatible with the diagnosis: homozygosity or compound heterozygosity for two pathogenic ARSA variants, increased urinary sulfatide excretion and/or decreased ASA activity within the range of MLD patients analyzed according to the Baum assay or modified Baum assay either at 0 or 4°C depending on the laboratory [10][11][12], or when genetic testing of an affected sibling identified the same pathogenic ARSA variants as the index patient. In all but two patients, ARSA was tested directly by Sanger sequencing. In these two patients, ARSA mutations were detected using next-generation sequencing techniques (whole exome sequencing (WES) in a patient with unexplained polyneuropathy (MLD-67) and WES-based testing of a leukodystrophy gene panel in a patient with unexplained brain white matter abnormalities (MLD-81)). For most patients, both parents were tested to confirm the presence of mutations on two different alleles.
Genetic and biochemical tests were performed in the metabolic laboratory of the Amsterdam UMC, in laboratories collaborating with the referring hospitals including the metabolic laboratories of the Erasmus University Medical Center (Rotterdam), Radboud University Medical Center (Nijmegen), and University Medical Center Groningen (Groningen), or at both locations. Results of genetic analysis of the ARSA gene, ASA activity, urinary sulfatide excretion, and data on ethnicity, sex, age of symptom onset, presenting symptoms, presence of affected siblings, and consanguinity of the parents were collected from patient records. Patients who had not undergone genetic testing were excluded from this study.
According to the age of symptom onset, patients were grouped into a late-infantile, early-juvenile, late-juvenile, and adult phenotype. Reference-corrected residual ASA activity levels were constructed by expressing the ASA activity level of each individual patient as a percentage of the mean and the lowest boundary of the used reference values and in number of standard deviations (s.d.) from the mean when available. These reference-corrected residual ASA activity levels were used to examine the correlation between MLD phenotype and mean residual ASA activity with the Spearman rank correlation test and the independent sample t test in RStudio (version 3.6.1).
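The reference correction and rank correlation described above can be sketched as follows. This is a minimal illustration in Python, not the R code used in the study: the function names, the numeric phenotype coding, and all example values are hypothetical, and p-values are not computed here.

```python
from statistics import mean


def reference_corrected(activity, ref_mean, ref_lower, ref_sd=None):
    """Express one ASA activity value relative to a laboratory's reference values."""
    return {
        # activity as a percentage of the reference mean
        "pct_of_mean": 100.0 * activity / ref_mean,
        # activity as a percentage of the lowest reference boundary
        "pct_of_lower": 100.0 * activity / ref_lower,
        # number of s.d. from the reference mean (only when an s.d. is available)
        "sd_from_mean": None if ref_sd is None else (activity - ref_mean) / ref_sd,
    }


def average_ranks(values):
    """Return 1-based ranks; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical example: phenotypes coded 0 (late-infantile) to 3 (adult),
# paired with made-up "percentage of mean" activity values.
phenotype_code = [0, 1, 2, 2, 3, 3]
pct_of_mean = [2.0, 6.5, 4.0, 9.0, 5.5, 7.0]
rho = spearman_rho(phenotype_code, pct_of_mean)
```

Expressing each measurement relative to the reference values of the laboratory that produced it is what allows results from different assays to be pooled before ranking.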

ARSA variants
We reported all identified ARSA variants according to the current nomenclature guidelines (http://www.varnomen.hgvs.org) [13] and GenBank accession number NM_000487.5 (https://www.ncbi.nlm.nih.gov/nuccore/NM_000487.5) [14]. We consulted different databases, including 1000 Genomes, ClinVar, and the Human Gene Mutation Database (HGMD), to investigate whether variants had previously been reported. For novel variants, we indicated whether they were missense, nonsense, silent or splice-site variants, or in-frame or frameshift deletions, insertions, duplications or indels. Furthermore, when available, data on parental presence of variants were analyzed to establish whether variants occurred de novo, whether they were biallelic or monoallelic, and whether they were inherited in trans or cis with the Pd allele (when present). Potential pathogenicity of the variants was predicted in silico using MutationTaster, SIFT (Sorting Intolerant From Tolerant), PROVEAN (Protein Variant Effect Analyzer), and PolyPhen-2. For all variants, we investigated whether the affected amino acid was involved in the catalytic site or dimer interface of ASA based on the location within the crystal structures of human ASA monomer and octamer (PDB-ID: 1AUK) using PyMOL [15,16]. Fig. 1 was prepared with RStudio (version 3.6.1), using the packages "ggplot2" and "packcircles" [17]. Fig. 2 was made with PyMOL based on the crystal structures of human ASA monomer and octamer (PDB-ID: 1AUK) [15,16].
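As an illustration of how the variant categories listed above relate to the protein-level notation, the following sketch sorts a few predicted protein changes into broad classes. It is a deliberately simplified pattern match and an assumption of this example, not a full HGVS parser and not the procedure used in the study; the function name is hypothetical, and in-frame indels and splice-site variants are not resolved at this level and fall into "other".

```python
import re

# Three-letter amino acid code, e.g. "Pro", "Leu"
AA = r"[A-Z][a-z]{2}"


def classify_protein_change(p_hgvs):
    """Roughly classify a predicted protein change such as 'p.(Pro428Leu)'."""
    body = p_hgvs.strip()
    if body.startswith("p."):
        body = body[2:]
    body = body.strip("()")  # predicted consequences are written in parentheses

    if body == "=":
        return "silent"  # no change in the amino acid sequence
    if "fs" in body:
        return "frameshift"  # e.g. Leu375Serfs*47
    if re.fullmatch(rf"{AA}\d+(\*|Ter)", body):
        return "nonsense"  # amino acid replaced by a stop codon
    match = re.fullmatch(rf"({AA})(\d+)({AA})", body)
    if match:
        return "missense" if match.group(1) != match.group(3) else "silent"
    return "other"


# Variants named in the text
for change in ["p.(Pro428Leu)", "p.(Cys302Tyr)", "p.(Leu375Serfs*47)", "p.(=)"]:
    print(change, "->", classify_protein_change(change))
```

A silent variant classified this way can still be pathogenic at the RNA level, which is why the protein-level label alone is not a pathogenicity call.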
For one variant, c.1200C > T, p.(=), the effect on exonic splicing of ARSA pre-mRNA was analyzed by reverse transcriptase-polymerase chain reaction (RT-PCR) using the primers 5′-GTATCGGAAAGAGCCTGCTG-3′ and 5′-ACGTTATCAGGCACAAACCC-3′ and the PCR conditions as specified by the manufacturer. For this purpose, peripheral blood was collected in PAXgene collection tubes, and mRNA was extracted using a PAXgene Blood RNA Kit (#762164, PreAnalytiX, Qiagen/BD) according to the manufacturer's instructions. Fig. 3, demonstrating the sequence of the mutated ARSA cDNA fragment of the patient compared to the ARSA cDNA sequence of a control, was made with SnapGene 4.2.9.
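When checking where such a primer pair sits on a cDNA sequence, the reverse primer is matched through its reverse complement. The sketch below assumes that the second primer listed above is the reverse primer, which the text does not state explicitly; the function names and the idea of searching a plain-text template string are illustrative only.

```python
FORWARD = "GTATCGGAAAGAGCCTGCTG"
REVERSE = "ACGTTATCAGGCACAAACCC"  # assumed to be the reverse primer

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def locate_primers(template, forward, reverse):
    """Return (start, end) of the expected amplicon on the sense strand.

    Returns None when either primer site is not found by exact matching.
    """
    template = template.upper()
    start = template.find(forward)
    reverse_site = reverse_complement(reverse)
    site_pos = template.find(reverse_site)
    if start == -1 or site_pos == -1:
        return None
    return start, site_pos + len(reverse_site)
```

Comparing the amplicon length expected from the reference cDNA with the observed product sizes is one way to spot fragments that lack part of an exon.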

Patient characteristics
Results of genetic analysis of the ARSA gene were available for 67 of the 76 MLD patients. The other nine patients had not undergone genetic testing and were therefore excluded from the study. Sixty-one of the 67 included patients had a Western European (Caucasian) ethnic background and six had a non-Caucasian ethnic background. The age of symptom onset ranged from 12 months to 36 years. Eleven patients (16%) had the late-infantile onset type, of whom five had a non-Caucasian ethnic background; 39 patients (58%) had the juvenile onset type, of whom fourteen had the early-juvenile (36%) and 25 the late-juvenile (64%) phenotype; and seventeen patients (25%) had the adult onset type of MLD. An overview of individual patient characteristics is presented in Supplementary Table 1. Patients with later onset forms did not necessarily have higher residual ASA activity levels.
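The phenotype percentages above follow directly from the reported counts. As a quick arithmetic check (variable and function names are illustrative only):

```python
# Patient counts per phenotype as reported in the text
counts = {
    "late-infantile": 11,
    "early-juvenile": 14,
    "late-juvenile": 25,
    "adult": 17,
}

total = sum(counts.values())  # number of included patients
juvenile = counts["early-juvenile"] + counts["late-juvenile"]


def pct(part, whole):
    """Percentage rounded to the nearest integer."""
    return round(100 * part / whole)


print("total patients:", total)
print("late-infantile:", pct(counts["late-infantile"], total), "%")
print("juvenile:", pct(juvenile, total), "%")
print("adult:", pct(counts["adult"], total), "%")
# Early- and late-juvenile shares are expressed relative to the juvenile group
print("early-juvenile within juvenile:", pct(counts["early-juvenile"], juvenile), "%")
print("late-juvenile within juvenile:", pct(counts["late-juvenile"], juvenile), "%")
```

Note that the early- and late-juvenile percentages use the 39 juvenile patients as denominator, not the full cohort.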
Mean ASA activity per MLD phenotype, calculated as a percentage of the mean and of the lowest boundary of the used reference values and in number of s.d. from the mean, is given in Supplementary Table 2. These reference-corrected means of residual ASA activity also showed no or only a (very) weak correlation with the MLD phenotype (Spearman rank correlation coefficients of 0.11 (p = 0.41), 0.26 (p = 0.05), and 0.13 (p = 0.36), respectively), including when analyzed in homozygous MLD patients only (Spearman rank correlation coefficients of − 0.08 (p = 0.79), 0.10 (p = 0.71), and 0.00 (p = 1.00), respectively).

Prevalence of ARSA variants
We identified a total of 31 different ARSA variants. The most frequent variant was c.1283C > T, p.(Pro428Leu), accounting for 44% of all pathogenic ARSA variants. Heterozygosity for this variant was observed in 26 patients (thirteen early-juvenile, nine late-juvenile, and four adult MLD patients), and homozygosity for the c.1283C > T variant was observed in seventeen patients (nine late-juvenile and eight adult MLD patients). In patients carrying the c.1283C > T variant, a strong correlation between MLD phenotype and reference-corrected means of residual ASA activity could again not be established (Spearman rank correlation coefficients of − 0.06 (p = 0.72; "mean corrected"), 0.06 (p = 0.74; "lowest boundary corrected"), and − 0.13 (p = 0.48; "s.d. corrected")). Of all patients homozygous for the c.1283C > T variant, juvenile MLD patients showed even higher reference-corrected means of residual ASA activity compared to adult MLD patients ("mean corrected": 11.4 vs 6.2, p = 0.003; "lowest boundary corrected": 20.7 vs 14.9, p = 0.018; "s.d. corrected": − 2.7 vs − 3.0, p = 0.314). All patients with the c.1283C > T variant had a Western European (Caucasian) ethnic background, and none of the patients homozygous for c.1283C > T became symptomatic before the age of 6 years. Importantly, none of the late-infantile MLD patients carried the c.1283C > T variant. The prevalence of each one of the other 30 pathogenic ARSA variants in our cohort ranged between 1 and 8%. The comparative prevalence of all ARSA variants in our cohort with their distribution of MLD type, zygosity, and ethnicity is shown in Fig. 1. In two siblings with MLD, only one pathogenic ARSA variant was detected. Finally, the Pd allele c.1055A > G, p.(Asn352Ser) + *96A > G was identified in 10 (15%) MLD patients. Heterozygosity for this variant was observed in eight Caucasian patients (one late-infantile, two early-juvenile, two late-juvenile, and three adult MLD patients), while homozygosity for this variant was observed in two non-Caucasian patients (one late-infantile and one late-juvenile MLD patient).

Identification of ten novel variants
Five out of the six patients with a non-Caucasian ethnic background, two of whom were siblings, had a previously unreported pathogenic ARSA variant. In total, we identified ten novel pathogenic ARSA variants in thirteen patients from eleven unrelated families. An overview of clinical and genetic characteristics of these patients is presented in Table 1. Almost half (6/13) of these patients suffered from the late-infantile MLD form. Two late-infantile patients, both from consanguineous families, were homozygous for the novel variants (c.1123_1126del, p.(Leu375Serfs*47) or c.905G > A, p.(Cys302Tyr)). Except for these patients and one late-juvenile patient being homozygous for the novel variant c.1200C > T, p.(=), all other patients were compound heterozygous with a second, previously reported ARSA variant. Based on parental genetic data, we were able to establish that none of the variants occurred de novo and that the c.1200C > T variant was inherited in cis with the Pd allele c.1055A > G, p.(Asn352Ser) + *96A > G. The distribution of the novel variants over the ARSA gene and their location within the ASA protein are shown in Fig. 2.

Characterization of novel variants
The ten novel variants included four missense variants, two nonsense variants, two frameshift variants (one single-base pair duplication and one four-base pair deletion), one in-frame indel, and one silent variant. None of these variants were present in the 1000 Genomes database, ClinVar, or HGMD. All novel variants were predicted to be disease causing by MutationTaster. All missense variants were predicted to be deleterious and probably damaging by SIFT, PROVEAN, and PolyPhen-2. The in-frame indel variant (c.836_837delTCinsAA) was predicted to be possibly damaging with HumDiv and HumVar scores of [...] RT-PCR analysis on ARSA mRNA extracted from the patient's blood indeed showed multiple splicing errors, including partial skipping of exon 7 (Fig. 3).
The sixth patient with a non-Caucasian ethnic background (MLD-34) had a North African ethnicity. Presenting signs were absence seizures, ataxia, and spasticity from age 12 months, and ASA activity was 7.0 nmol/h/mg (ref: -90 nmol/h/mg). She was homozygous for the c.545C > G, p.(Pro182Arg) variant and the Pd allele and had consanguineous parents. This variant was previously reported in one early-juvenile MLD patient heterozygous for this variant (second allele c.1283C > T) without data on ethnic background or the presence of the Pd allele [18].

Discussion
Worldwide, the late-infantile type is the most prevalent MLD type (48% of all patients) [3]. In contrast, we observed that in The Netherlands the juvenile type is much more common (58% of all patients). We showed that this is due to the high frequency of the c.1283C > T, p.(Pro428Leu) missense variant in Dutch MLD patients, a pathogenic ARSA variant that affects the stability of the ASA octamer by lowering the acidic pH [19]. The fact that this variant was not found in any late-infantile MLD patient and was found only in a heterozygous state in early-juvenile MLD patients confirms that this variant is associated with a later disease onset [3,5,20], although without a strict correlation with relatively high residual ASA activity levels. The common observation that patients with later onset forms do not necessarily have higher residual ASA activity levels might be caused by multiple factors; e.g., ARSA variants might influence other ASA properties in addition to its activity in blood leukocytes. It is, however, important to consider that the ASA activity levels in this study were measured in different laboratories. Other possible explanations are therefore assay performance differences between laboratories or inter-assay variability.
In addition, we identified ten novel pathogenic ARSA variants. Two missense variants affected the same amino acid as missense variants previously reported as pathogenic. These two missense variants are the c.905G > A p.(Cys302Tyr) variant corresponding to the c.905G > T, p.(Cys302Phe) [21], and the c.371G > A, p.(Gly124Asp) variant corresponding to the c.370G > A, p.(Gly124Ser) and c.370G > T, p.(Gly124Cys) variants [22,23]. Moreover, we identified the c.1200C > T variant as a potential disease-causing silent variant. Silent variants are often considered to be non-disease-causing since the amino acid sequence and subsequently protein structure and function are thought not to be altered [24]. However, the c.1200C > T variant seems to result in multiple splicing errors, and therefore, in agreement with segregation data in this family, to be a pathogenic ARSA variant. However, at this stage we cannot exclude the possibility that the observed altered pre-mRNA splicing of ARSA is due to an unknown intronic variant, in cis with the c.1200C > T variant. Taking into account the residual ASA activity and late-juvenile phenotype, it is likely that low levels of mRNA are spliced properly. Nevertheless, this is speculation and has not yet been investigated by analyzing multiple cDNA clones.
Remarkably, the presented group of patients with a novel ARSA variant had a much higher proportion of MLD patients with the late-infantile type (46%) and a non-Caucasian ethnicity (38%) compared to our full patient cohort (respectively 16% and 9%). This might be caused by the natural population prevalence of ARSA variants but might also be influenced by underdiagnosing and underreporting MLD patients with other ethnic backgrounds [25,26]. This is a point of concern, considering global migration and in case genetic tests are employed that only test for common ARSA variants. False-negative MLD diagnosis could result in withholding treatment opportunities for otherwise eligible patients, especially since experimental therapies including gene therapy and intrathecal enzyme-replacement therapy are evolving. More phenotype information on pathogenic ARSA variants will also help interpreting results from the pilot   [33].
To conclude, with this study we provide a genetic base of the unique MLD phenotype distribution in The Netherlands. Our study also demonstrates the importance of genetic analyses in diagnosing and phenotyping MLD patients and stresses the need for sequencing the entire ARSA gene in patients with suspected MLD, instead of screening only for common ARSA variants. In case of variants of unknown significance and to confirm pathogenic variants, ASA activity in leukocytes or fibroblasts should be measured. Clinicians should be aware of unknown pathogenic ARSA variants in patients with various ethnic backgrounds.
Funding information This study was funded by the Dutch charity organization Metakids. The funding source had no role in the design, analyses, reporting of the study or in the decision to submit the manuscript for publication. The authors of this publication affiliated with the Amsterdam Leukodystrophy Center are members of the European Reference Network for Rare Neurological Diseases - Project ID No 739510.

Compliance with ethical standards
Conflict of interest The authors declare that they have no conflict of interest.

Consent for publication Not applicable.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.