Introduction

Aspartoacylase (ASPA; EC 3.5.1.15; UniProt ID: P45381), also known as aminoacylase II (ACY2) [1] or N-acetyl-L-aspartate amidohydrolase [2] in older literature, is a 35.7 kDa, 313-residue enzyme that catalyzes the hydrolysis of N-acetyl-L-aspartic acid (NAA) into acetate and aspartate [3,4,5]. In humans, the ASPA gene is located on the short arm of chromosome 17, spans 29 kb and contains 6 exons [3, 6]. The ASPA mRNA is widely distributed, but is particularly highly expressed in kidney and in oligodendrocytes of the brain [7,8,9]. The encoded enzyme is a single-domain homodimeric protein complex, with a highly specific active site buried within a channel in the native protein [10]. Insufficient ASPA activity caused by germline ASPA variants is linked to Canavan disease (CD) (OMIM: 271900), a recessive, neurodegenerative leukodystrophy in which oligodendrocytes fail to correctly myelinate neuroaxons [11]. While the precise pathogenic mechanism remains elusive, several non-mutually exclusive hypotheses have been proposed [12,13,14,15].

Clinically, CD patients display poor muscle control, decreased cognitive capabilities and other severe conditions. Typically, these symptoms appear early, within the first 3–6 months of life, and gradually progress over time, eventually leading to an early death [16]. Various attempts at ameliorating the symptoms [17,18,19] and curing the disease [20,21,22] have been reported, including, more recently, promising attempts and ongoing trials with gene therapy [21, 23, 24]. Accordingly, clinical classification and a detailed understanding of how pathogenic ASPA gene variants operate are highly warranted. Using deep mutational scanning technologies, it was recently shown that most loss-of-function missense variants cause a structural destabilization of the ASPA protein, leading to the formation of non-native ASPA proteins that negatively affect cell fitness and are subject to rapid degradation by the cellular protein quality control (PQC) system [25]. Studies on ASPA therefore also provide a model system for understanding intracellular protein folding, misfolding and the PQC system, which ultimately may further our understanding of CD and other protein misfolding diseases.

Here we comprehensively summarize the physiological, cellular, and molecular details of the ASPA enzyme, its substrate NAA and Canavan disease. We focus particularly on the pathogenic ASPA gene variants, the importance of variant classification, and the value of a mechanistic understanding of how loss-of-function gene variants operate for the future implementation of gene therapy or other forms of precision medicine.

ASPA gene expression

While ASPA gene expression is elevated in the brain and even higher in the kidneys [1, 2, 7, 26,27,28,29], the protein also seems to be present to a lesser extent in several other tissues, including liver, intestine and lung [1, 7, 30,31,32]. Indeed, skin fibroblasts have also been used to obtain ASPA for activity assays [32,33,34].

However, the most important site of ASPA expression is the brain white matter (WM), where the enzyme plays an essential role in NAA catabolism, as evidenced by the following investigations in rats. Some studies, using antibodies, detected ASPA protein clearly in oligodendrocytes, and faintly in neurons and microglial-like cells, but not in astrocytes [35, 36]. These observations were corroborated by two other studies of both ASPA mRNA [8] and protein [37], which found it to be restricted [37] or primarily restricted [8] to oligodendrocytes. Similarly, one study detected ASPA enzymatic activity in oligodendrocytes, but not astrocytes [38]. However, yet another study found ASPA activity in O2A progenitor cells and both their differentiated cell types (astrocytes and oligodendrocytes), but not in neurons [39]. It has been reported that rat Schwann cells (the peripheral nervous system equivalent of oligodendrocytes) do not express ASPA mRNA [8], while a later study in mice found Schwann cells to express ASPA. However, the authors noted that the peripheral nerves looked grossly normal in CD mice, thus emphasizing the role of ASPA primarily within the central nervous system (CNS) [30]. This was corroborated by a study of auditory processing in the same mouse strain, showing that functional and morphological deterioration was limited to the CNS, with the cochlear nerve fibers being unaffected [40].

Regardless of these discrepancies, there seems to be consensus that ASPA is mainly associated with oligodendrocytes and their myelin sheaths. This notion is supported by the fact that oligodendrocyte-specific ASPA knock-out in mice causes the same, albeit milder, phenotype in the CNS as whole-body ASPA knock-out [41]. Further supporting this idea are the low ASPA activity observed in grey matter (GM) [39] and the fact that the brain periphery remains unaffected in CD patients [29]. Even within the brain WM, ASPA expression exhibits spatiotemporal regulation [35, 39]. Temporally, little to no ASPA seems to be present in neonatal rats; levels start to rise postnatally, coinciding with the myelination of the brain, and then decrease somewhat while remaining detectable in adult rats [8, 37, 39, 42].

ASPA protein structure

ASPA consists of an N-terminal domain (residues 1–212) and a C-terminal domain (residues 213–313) (Fig. 1). The N-terminal region consists of a central six-stranded mixed β-sheet surrounded by eight helices of variable size and multiple connecting loops. The C-terminal domain consists of two antiparallel β-sheets that wrap around the substrate-binding side of the N-terminal domain with a globular portion between the two β-sheets [10]. The two domains connect through various interactions, including a β-sheet anchor formed between β1 and β13 at the very N- and C-terminal ends of the domain (based on the solved structure of rat ASPA and the AlphaFold predictions for human ASPA). Additionally, the C-terminal region of ASPA wraps around the N-terminal region via the antiparallel β-strands β8 and β12 [10]. Consequently, the N-terminal domain is not stable when expressed on its own [25, 43], and the two domains should thus be considered as one joint unit. Together, the N- and C-terminal regions form a channel leading to the active site.

Fig. 1

The ASPA protein structure. The structure of the human ASPA homodimer (PDB: 2O53) [44] is shown with the two subunits in blue and yellow. The N-terminal region covering residues 1–212 (upper panel) and C-terminal region spanning residues 213–313 (lower panel) are highlighted

Sequence alignments [45] and structural analyses [10, 46] have demonstrated similarities between the N-terminal part of ASPA and a range of Zn2+-dependent carboxypeptidase A-related hydrolases [10, 45]. However, carboxypeptidase A has a ~60-residue N-terminal extension of the central β-sheet by two strands, and carboxypeptidases completely lack the C-terminal extension [10]. More specifically, ASPA belongs to the succinyl glutamate desuccinylase/aspartoacylase family (AstE/AspA, PFAM04952) [10]. The C-terminal region likely reflects the requirement for high substrate specificity in ASPA, which is localized in the cytosol, in contrast to carboxypeptidase A, which cleaves a range of peptides in the small intestine.

Unlike aminoacylase I, which hydrolyses N-acetyl groups from all amino acids, ASPA (also known as aminoacylase II) exhibits high substrate specificity towards N-acetyl-L-aspartic acid [16, 47].

The C-terminal extension of ASPA shields the active site, restricting access to it. In the entry channel, R71, K228, K291, K292 and E293 provide a positive electrostatic potential, which may guide NAA to the active site while repelling positively charged metabolites. Although some peptides may enter the channel, the C-terminal region would orient them in a position that does not enable hydrolysis to occur [10, 48]. A tight pocket consisting of residues T118, Q184, F282, E285, A287, and Y288 accommodates the acetyl group of NAA, while restricting compounds with acyl groups longer than acetate. ASPA also shows high selectivity towards the aspartate side of NAA [49], possibly due to a hydrogen bond between R168 and the β-carbonyl group of NAA [10]. The key catalytic core residues include R63, N70, R71, R168, E178 and Y288, as well as H21, E24 and H116, which coordinate the catalytic Zn2+ ion (Fig. 2).

Fig. 2

The active site. A zoom in on the active site within one subunit of the human ASPA structure (PDB: 2O53) [44] and residues (H21, E24, H116) coordinating the Zn2+ ion (red). Residues Arg63, Asn70, Arg71, Arg168, and Tyr288 interact with the substrate

A “promoted-water pathway” mechanism similar to that of carboxypeptidase A has been proposed for the hydrolysis catalyzed by ASPA (Fig. 3). In this model, E178 deprotonates a water molecule, which is stabilized by Zn2+. The resulting hydroxide then attacks the β-carbonyl group of NAA, which is stabilized by R63 and possibly also the Zn2+ ion (Fig. 3A), to allow the formation of a tetrahedral intermediate (Fig. 3B). Lastly, the intermediate collapses with aspartate being eliminated as the leaving group [10] (Fig. 3C). The model is supported by computational analyses, which also indicated that substrate release, rather than bond cleavage, is the rate-limiting step of the reaction [50].

Fig. 3

Overview of the proposed catalytic mechanism of aspartoacylase. A First, water initiates a nucleophilic attack on the NAA carbonyl group, leading to formation of B a tetrahedral intermediate and finally C the products. Green indicates residues coordinating Zn2+, purple indicates residues interacting with the substrate, and yellow indicates the catalytically active E178 residue

Homodimer formation and its functional implications

Solved crystal structures of human and rat ASPA show that they form similar homodimers. In humans, the dimer interface covers ~1200 Å2 of solvent-accessible surface area and involves 12 hydrogen bonds and two salt bridges [10, 44]. This structural evidence of dimerization has been supported by various biochemical assays performed on human and rat ASPA [10, 37, 46, 49]. However, some of these observations could be explained by aggregation [49] or antibody cross-reactivity [26, 37]. Additionally, size-exclusion chromatography showed that the ASPA monomer is enzymatically active, but does not exclude the possibility of an ASPA homodimer [26]. Hence, despite the data supporting an ASPA homodimer, the importance of dimer formation remains somewhat elusive, but it could allow for allosteric and cooperative regulation of the enzyme.

In agreement with this, human ASPA produced from Pichia pastoris has been shown to display unusual enzymatic properties. Thus, at low NAA levels, the enzyme showed sigmoidal behavior indicative of subunit cooperativity. Conversely, at high NAA levels significant substrate inhibition was observed, indicative of non-competitive inhibition of ASPA through non-catalytic NAA-binding sites on ASPA. Notably, similar behavior was observed when using the alternative substrate N-trifluoroacetyl-L-aspartate (trifluoro-NAA) [46].
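The two kinetic regimes described above can be illustrated with a simple phenomenological rate law that combines a Hill term (cooperativity at low substrate) with a substrate-inhibition term (decline at high substrate). This is only a sketch: it is not the model fitted in [46], and the function name `aspa_rate` and all parameter values (`vmax`, `k_half`, `hill_n`, `k_i`) are hypothetical, chosen solely to make the curve shape visible.

```python
def aspa_rate(s_mM, vmax=1.0, k_half=0.5, hill_n=2.0, k_i=5.0):
    """Illustrative rate law: v = Vmax * S^n / (K^n + S^n * (1 + S / Ki)).

    All parameters are hypothetical placeholders, not measured ASPA constants.
    hill_n > 1 gives a sigmoidal rise; the S / Ki term gives substrate inhibition.
    """
    if s_mM < 0:
        raise ValueError("substrate concentration must be non-negative")
    s_n = s_mM ** hill_n
    return vmax * s_n / (k_half ** hill_n + s_n * (1.0 + s_mM / k_i))


if __name__ == "__main__":
    # Scan a range of NAA concentrations (mM) to show the rise and the decline.
    for s in (0.05, 0.1, 0.5, 1.0, 2.0, 10.0, 50.0):
        print(f"[NAA] = {s:6.2f} mM  ->  relative rate = {aspa_rate(s):.3f}")
```

With these placeholder values the rate more than doubles when the substrate concentration is doubled in the low range, and falls again at high concentrations, which is the qualitative pattern reported for the P. pastoris-derived enzyme.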

Molecular dynamics (MD) simulations and molecular docking studies of ASPA have attempted to elucidate some of the above observations, revealing that monomeric ASPA exists predominantly in a “closed” conformation where access to the active site is hindered by the “gate residues”, mainly constituted by the R71-E293 salt bridge and the hydrogen-bonded pair Y64-K291. However, in the dimeric state, one monomer remains in the “closed” conformation, whereas the other one fluctuates between the “open” and “closed” conformations, allowing substrate entry to the active site. Additionally, an activating allosteric NAA binding site was observed for each monomer, as well as an inhibitory binding site near the dimer interface. The activating site includes residues R56, K59, and K60, is easily accessible and has a predicted binding free energy of −6.5 kcal/mol, whereas the inhibitory site involves the side chains of K292 and R233 as well as the main chains of E290 and G237, is more secluded in the structure, and has a predicted binding energy of −4.7 to −4.8 kcal/mol [48, 51]. Despite the two monomers in the dimer being close to identical [44], the simulations found their dynamic properties to differ, which could explain the existence of one shared inhibiting NAA allosteric site between the two subunits, and why one subunit remains inactive [48]. The positive allosteric site explains the observed sigmoidal behavior [46] at low NAA levels [51]. Conversely, at high NAA levels, binding to the inhibitory allosteric site of one monomer affects the dimer interface communication, leading to a pathway of conformational changes that propagate through the other monomer. This increases the rigidity of the loops with the gate-forming residues in loops 62–74 and 282–294, thus hindering access to the active site [48]. Yet the biological rationale behind the negative allosteric regulation remains enigmatic.
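To give a feel for what the predicted binding free energies quoted above imply, they can be converted to approximate dissociation constants with the standard relation ΔG = RT ln(Kd). The sketch below assumes a temperature of 298.15 K and a 1 M standard state, neither of which is stated in the text, and the function name `dissociation_constant` is introduced here only for illustration. Docking-derived energies are rough estimates, so the resulting values should be read as orders of magnitude at best.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def dissociation_constant(delta_g_kcal_per_mol, temperature_k=298.15):
    """Approximate Kd (mol/L) from a binding free energy via dG = RT ln(Kd).

    Assumes a 1 M standard state; the default temperature is an assumption.
    """
    return math.exp(delta_g_kcal_per_mol / (R_KCAL * temperature_k))


if __name__ == "__main__":
    # Energies taken from the text: activating site and inhibitory site.
    for label, dg in (("activating site", -6.5),
                      ("inhibitory site", -4.7),
                      ("inhibitory site", -4.8)):
        kd_uM = dissociation_constant(dg) * 1e6
        print(f"{label}: dG = {dg} kcal/mol -> Kd of roughly {kd_uM:.0f} uM")
```

Under these assumptions the activating site corresponds to a Kd in the tens of micromolar, while the inhibitory site falls in the hundreds of micromolar, which is at least consistent with activation dominating at low NAA levels and inhibition appearing only at high NAA levels.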

Considering the allosteric nature of ASPA, where dimer interactions are necessary for efficient catalysis, dominant negative effects, where a non-functional ASPA variant dimerizes with and inhibits the wild-type subunit, seem plausible. However, given the recessive nature of the disease, this is not the case. The catalytic requirement is potentially low, meaning that in a heterozygous individual, the wild-type/wild-type dimers that form are sufficient, even if the variant ASPA is inactive and forms inactive wild-type/variant dimers. Moreover, the rapid degradation of many disease-linked variants [25, 52] also increases the likelihood of forming wild-type dimers in heterozygous cells. Additionally, dimer formation may preferentially be allele-specific since newly synthetized ASPA monomers from the same mRNA are more likely to form dimers. This has been proposed as a general mechanism that buffers proteins against dominant negative effects [53]. Indeed, the large interaction surface between the subunits in the ASPA dimer suggests that the dimer is stable. In turn, this reduces the risk of forming dominant negative, inactive wild-type/variant dimers. Since the dimer interface involves both the N- and C-terminal regions [10, 44], co-translational assembly [54] would likely require that the N-terminal part of ASPA co-translationally binds the C-terminal region of a pre-existing or newly synthetized ASPA monomer.
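The argument about wild-type dimers in heterozygous cells can be made concrete with a small calculation. The sketch below assumes purely random pairing of monomers from a mixed pool, which is a deliberate simplification: it ignores the allele-specific (cis) assembly discussed above, which would raise the wild-type/wild-type share further. The function name `dimer_fractions` and the example pool compositions are hypothetical.

```python
def dimer_fractions(wt_fraction):
    """Return (WT/WT, WT/variant, variant/variant) dimer fractions.

    Assumes random pairing of monomers, where wt_fraction is the share of
    wild-type monomers in the steady-state pool (a hypothetical input).
    """
    if not 0.0 <= wt_fraction <= 1.0:
        raise ValueError("wt_fraction must lie between 0 and 1")
    p = wt_fraction
    return p * p, 2.0 * p * (1.0 - p), (1.0 - p) ** 2


if __name__ == "__main__":
    # 0.5: both alleles contribute equally stable protein.
    # 0.8: illustrative case where the variant is largely degraded.
    for p in (0.5, 0.8):
        wt_wt, wt_var, var_var = dimer_fractions(p)
        print(f"WT share {p:.1f}: WT/WT {wt_wt:.2f}, "
              f"WT/variant {wt_var:.2f}, variant/variant {var_var:.2f}")
```

With equal contributions from both alleles, only a quarter of the dimers are wild-type/wild-type under random pairing; if degradation shifts the monomer pool towards wild-type, that share rises steeply, in line with the buffering effect described in the text.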

The contributions from allosteric regulation and stabilization from homodimers make predictions of variant effects harder, and pose a challenge for genotype–phenotype predictions of CD. Yet, to the extent it occurs in ASPA, cis-assembly of allele-specific monomers will limit these effects.

ASPA enzyme activity assays

Given the obvious diagnostic value for Canavan disease, much work has been put into assays to determine ASPA enzymatic activity. One assay couples NAA hydrolysis to NADH oxidation (Fig. 4AB). Here, α-ketoglutarate and aspartate aminotransferase are used to create glutamic acid and oxaloacetate from the aspartate product. In the second reaction, malate dehydrogenase reduces oxaloacetate into malate, simultaneously oxidizing NADH into NAD+. Since only the reduced NADH form has a noticeable absorbance at 340 nm, NAA hydrolysis can be detected as a drop in absorbance at 340 nm [55]. This assay has been used for many decades [1, 32], with different modifications [39, 56, 57]. A slightly simpler method uses aspartase to convert aspartate into fumarate, which absorbs light at 240 nm [49] (Fig. 4AC).
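As a minimal sketch of how a reading from the NADH-coupled assay translates into an activity, the drop in absorbance at 340 nm can be converted to a reaction rate with the Beer-Lambert law, using the commonly cited molar absorption coefficient of NADH at 340 nm (6220 M⁻¹ cm⁻¹). The sketch assumes a 1:1 stoichiometry between aspartate released and NADH oxidized, coupling enzymes in excess so that ASPA is rate-limiting, and no background NADH oxidation. The function names, cuvette path length, assay volume and enzyme amount are illustrative assumptions, not values from the cited protocols.

```python
NADH_EXTINCTION_340 = 6220.0  # M^-1 cm^-1, molar absorption coefficient of NADH at 340 nm


def naa_hydrolysis_rate(delta_a340_per_min, path_length_cm=1.0):
    """Convert a change in A340 per minute into an NAA hydrolysis rate (uM/min).

    Beer-Lambert: dA = epsilon * l * dc. A drop in A340 (negative slope)
    corresponds to NADH consumed, assumed 1:1 with aspartate released.
    """
    nadh_consumed_molar = -delta_a340_per_min / (NADH_EXTINCTION_340 * path_length_cm)
    return nadh_consumed_molar * 1e6


def specific_activity(delta_a340_per_min, assay_volume_ml, enzyme_mg, path_length_cm=1.0):
    """Specific activity in umol of NAA hydrolyzed per minute per mg of enzyme."""
    rate_uM_per_min = naa_hydrolysis_rate(delta_a340_per_min, path_length_cm)
    umol_per_min = rate_uM_per_min * (assay_volume_ml / 1000.0)  # uM * L = umol
    return umol_per_min / enzyme_mg


if __name__ == "__main__":
    # Hypothetical reading: A340 falls by 0.0622 per minute in a 1 mL assay
    # containing 0.01 mg of enzyme, measured in a 1 cm cuvette.
    slope = -0.0622
    print(f"rate: {naa_hydrolysis_rate(slope):.2f} uM/min")
    print(f"specific activity: {specific_activity(slope, 1.0, 0.01):.2f} umol/min/mg")
```

Because buffer composition and expression host are reported to change measured activities substantially, a conversion like this is mainly useful for comparing variants against wild-type within a single experimental setup.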

Fig. 4

Overview of the enzymatic reactions used to assay ASPA activity. A ASPA hydrolyzes N-acetyl-L-aspartate (NAA) to acetate and aspartate. The released aspartate is coupled to other enzymatic reactions, such as B aspartate aminotransferase and malate dehydrogenase [32], C aspartase [49], and D aspartate oxidase [58]. The reactions can be followed by spectrophotometry based on the indicated specifications (red)

Some groups have utilized NAA with 14C-labelled aspartate and thin-layer chromatography to assess NAA hydrolysis [26, 43, 59], while others have examined residual NAA using high-pressure liquid chromatography (HPLC) [38, 60]. Somewhat similarly, ASPA activity has been measured as the release of tritium-labelled acetate from 3H-NAA [33, 61]. Furthermore, the distinct peaks in the 1H NMR spectra of NAA and its product acetate provide another means of measuring ASPA activity, even without the use of isotope-labelled NAA [62].

Recently, a new activity assay was developed, where aspartate is oxidized using L-aspartate oxidase (Fig. 4AD). This creates H2O2, which can be measured with peroxidase in a fluorimetric assay. While inferior to chromatography-based assays for samples with low NAA concentrations, the assay is scalable, potentially allowing for high-throughput determination of ASPA enzyme activities [58].

An alternative approach for a high-throughput ASPA activity assay would be a yeast survival assay, in which yeast, complemented with a library of ASPA variants, is grown in a minimal medium where hydrolysis of NAA by ASPA provides the only nitrogen or carbon source. However, when this was attempted using WT S. cerevisiae transformed with vector or WT ASPA, no differences in growth were observed, with neither strain growing in carbon-deficient media and both growing equally well in nitrogen-deficient media [52].

Estimates of the specific activity of ASPA vary across the literature, likely due to variations between measuring techniques, protein purification protocols, and buffer conditions. For example, it has been reported that ASPA activity doubled when a phosphate buffer was exchanged for a Tris buffer [56]. In addition, human ASPA appears more stable and active when purified from P. pastoris instead of E. coli [46]. Hence, comparing specific ASPA activities across publications is of limited use. However, enzyme activities of ASPA variants relative to wildtype ASPA can be estimated [7, 43, 60, 63, 64], although the rather poor enzymatic activity of ASPA remains a hurdle for obtaining useful measurements. Examples of reported specific activities of ASPA are listed in Table 1. It is tempting to attribute the low enzyme activity to the inhibitory effect of excess NAA [46], but it is also likely a consequence of the high substrate-specificity [10], and ASPA activity is thus limited by the speed at which products can leave the entry channel [50].

Table 1 Reported specific activities

Canavan disease

Defects in ASPA functionality lead to Canavan disease (MIM# 271900), a type of leukodystrophy [11]. The initial clinical characterization of the disease is credited to a 1931 publication by Myrtelle Canavan [65], who described spongiform degeneration of brain WM in what she initially diagnosed to be a case of Schilder’s disease (another leukodystrophic disease) in a child [16]. Similar brain pathology was described in an earlier report by Globus and Strauss [66], but was also diagnosed as Schilder’s disease [67]. In 1949 the phenotype was recognized as a distinct disease by Van Bogaert and Bertrand, who also reported its autosomal recessive inheritance pattern and high prevalence among the Ashkenazi Jewish population [16, 67, 68]. Hence, the disease has also occasionally been referred to as “Canavan-Van Bogaert-Bertrand” disease [19].

The enzyme was initially purified from porcine kidney [1, 4, 5, 26], but was not linked to Canavan disease until a few decades later. The major breakthrough in the understanding of Canavan disease came with two discoveries. Firstly, elevated NAA levels in the urine (N-acetylaspartic aciduria) were observed in a child with extensive and progressive cerebral atrophy by Kvittingen et al. [69]. Secondly, Hagenfeldt et al. observed similar leukodystrophic symptoms and N-acetylaspartic aciduria in another child, and linked them to ASPA deficiency [32]. Soon thereafter, the symptoms of the two patients and others were recognized to be similar to those described by Van Bogaert and Bertrand [56, 70]. Thus, Canavan disease was characterized as a monogenic disease, caused by insufficient aspartoacylase activity leading to loss of NAA catabolism [71, 72]. Later significant milestones include the cloning of human ASPA cDNA in 1993 [3], and the discovery of the first specific mutations in ASPA [73].

Symptoms of Canavan disease

Symptoms of CD typically manifest within the first 3–6 [16] or 0–6 [74] months based on different reports. Common early symptoms include megalencephaly (enlargement of the head), hypotonia (loss of muscle tone), developmental delays, increased irritability and abnormal eye movements / nystagmus [16, 74]. The triad of hypotonia, head lag and megalencephaly should suggest Canavan disease when WM involvement is suspected [16]. Past the first 4–6 months of life, the developmental delays and megalencephaly become more apparent [16, 74, 75]. Psychomotor development in patients is usually limited to that of a 1-year-old, with few patients acquiring fine motor skills such as the ability to draw or scribble. Likewise, only 3 out of 23 patients in one study were able to speak single words, and none could form complete sentences [74]. With age, hypotonia develops into spasticity [16, 74] and the developmental delays become more apparent [67]. Motor and verbal skills are especially affected, with most children with Canavan disease being unable to properly sit, stand, walk or talk [67]. However, in spite of profound delays, CD patients can sometimes interact with others, smile, and reach for objects [75]. In addition to hypotonia, contractures and decubiti have also been reported for some CD patients, and need to be prevented by exercise and position changes [76]. In these cases, physical therapy is recommended to minimize contractures and maximize motor abilities and seating posture [76, 77]. Additional symptoms may include feeding difficulties, sleep disturbances, and poor vision [16, 74]. Many patients may require assisted feeding through a gastric tube [67, 74, 78] or permanent gastrostomy [67], as their ability to swallow voluntarily is lost [75]. Approximately 57% of CD patients also develop seizures [74, 78, 79], often requiring anticonvulsant medication [74]. A newer study found that, while rare in the first year of life, seizures increase in frequency over time in most patients, with the highest frequency towards the end of the first decade of life [74]. Another study put the mean onset of seizures at 9 months of age, also noting that the seizures were generalized tonic–clonic (i.e. involving both tonic (stiffening) and clonic (twitching or jerking) phases of muscle activity) [80].

Previously, many patients succumbed to the disease within the first years of life. However, improved medical and nursing care has extended life expectancy, with a significant number of patients now reaching their second or even third decade of life [16, 74, 78, 81, 82].

Macroscopic and histopathological symptoms

Macroscopically, the brain weight of CD patients under 20 months of age is approximately 150% of the normal average, whereas for patients over 30 months it has normalized to 103%. This is most likely attributable to macroencephaly and subsequent brain degeneration, respectively [83]. An ill-defined demarcation of cortical GM and WM has also been observed, likely caused by the poor myelination [83, 84].

Histologically, Canavan disease is characterized by progressive spongiform degeneration of the brain WM [71, 84], where oligodendrocytes, found within the brain WM, fail to myelinate the neuroaxons, thus rendering the neurons unable to function normally [11]. Vacuoles were observed within the myelin sheaths in the subcortical WM [71]. Moreover, a significant number of Alzheimer type II astrocytes were found in the cerebral cortex, cerebellum, and basal ganglia [71, 83], although this was not observed for one investigation, which reported only a few scattered Alzheimer type II astrocytes in both WM and GM [85].

Interestingly, a study reported that neurons appeared normal [83], although mouse studies on ASPA deficient mice have shown vacuolization and axonal loss in the cerebellum, suggesting some extent of neuronal damage does occur [86, 87]. Hypertrophy, hyperplasia [88, 89] and astrocytic gliosis (astrocytosis) [90, 91] have also been reported, with some cells having unusually elongated mitochondria [88, 89, 92].

Different subtypes of Canavan disease

It is apparent that disease severity differs notably between CD patients. Accordingly, some literature distinguishes between the common infantile form, and the atypical congenital and juvenile forms [75, 83]. Unlike the infantile form, where symptoms appear after around 3 months, in the congenital form they appear at or a few days after birth, often leading to death within several days or weeks. In contrast, symptoms in the juvenile form are delayed, with the initial symptoms appearing later in life [83]. There are many examples of juvenile CD patients [93,94,95,96,97,98,99,100,101,102,103]. Notably, in certain cases, some symptoms (delayed motor milestones) appeared early, as seen in classical CD. However, the disease progression remained milder than usual, indicating that even milder and juvenile forms may manifest and be noticeable at the early stages [102, 104]. Conversely, one study reports that six patients with infantile-onset CD survived beyond six years of age, but points out that this might be the result of better medical management and care, rather than evidence of genetic heterogeneity [82]. Indeed, a later paper concludes that the clinical course of CD patients was not due to mutation heterogeneity, but rather reflects the improvement in patient care and other unrelated factors [105]. The observation is corroborated by another study noting that “prolonged survival of patients with early-onset disease, even into the second and third decade, is not uncommon” [78].

Interestingly, one report found no obvious correlation between the severity of the WM degeneration and the clinical presentation [106]. Likewise, another study noted that neither seizures nor basic psychomotor skills (visual tracking and head control) within the first two years of life had a statistically significant influence upon survival [74]. Hence, specific CD symptoms may not always reflect the severity of the disease but vary from patient to patient.

Some literature on variability in the manifestations of the disease did not find evidence supporting the three distinct forms of CD [78]. Likewise, the categorization was described as “flawed” by a later study, which instead argued for using gene-based diagnosis of typical versus mild Canavan disease [24]. Consequently, rather than three distinct forms, CD phenotypes are likely better described as a spectrum of severity, with the possibility of specific symptoms being more or less pronounced due to genetic or environmental factors. More recently, a CD severity score with assessment of 11 symptoms and abilities was developed [74], which may help to systematically and objectively evaluate CD, to compare across studies, and potentially assign severity to specific ASPA alleles.

Possibly, CD severity reflects, at least in part, the patient genotype, with ASPA variants with slight residual activity resulting in milder forms of CD. However, other genetic or environmental factors may play a role as well [40, 78]. For example, in a juvenile CD case, two sisters who were both compound heterozygous for the same alleles (A305E and R71H) presented with developmental delays from 19 and 50 months, respectively, suggesting that factors other than genotype also play a role [96].

Another study reported that two CD patients homozygous for the A305E variant had a milder phenotype, although most CD patients homozygous for the variant had early onset of CD and severe symptoms [93]. Likewise, in one study of 23 CD patients, macrocephaly seemed to occur slightly earlier in girls (7 months) than in boys (8.5 months), hinting towards sex differences in the manifestation of that specific symptom [74].

Phenotypic variation is also seen in the Aspa−/− mice, where some died shortly after weaning, while others survived between 1.5 and 9 months [28]. Differences in severity are also evident between the different Aspa knock-out mouse models [107] (Box 1).

So far, attempts at linking specific variants to CD severity have been limited [43, 63, 64, 74, 93, 101, 108]. However, variants likely to be mild may include G274R, P181T, Y231C, P257R, I143T, K213E, R71H, Y288C, I170T, G101V and D204H [63, 64, 101]. Given the recessive nature of the disease, variant effects on CD severity need to be considered in the context of the variant expressed from the other ASPA allele. A mild variant may not cause CD at all in a homozygous setting but could yield a mild phenotype when combined with a detrimental variant. Given the ASPA homodimer formation, specific variant combinations may also potentially result in positive or negative epistasis, adding further complexity to the matter.

Box 1 Animal models for Canavan disease

An Aspa knock-out mouse model was constructed by homologous recombination of an Aspa construct with a 10 bp deletion in exon four in ES cells, which were subsequently injected into C57BL/6-Tyrc−Brd blastocysts [28]. In accordance with the recessive nature of CD in humans, heterozygous CD mice had no overt phenotype, whereas homozygous mice displayed various CD symptoms [28]. For instance, the Aspa homozygous knock-out mice had lower weight (7.23 ± 0.93 g at weaning) compared to heterozygous or wild-type littermates, as well as age- and gender-matched controls (14.06 ± 0.84 g) [28]. The CD mice also had a reduced lifespan, with a few mice dying shortly after weaning and others surviving for between 1.5 and 9 months. Other symptoms include macroencephaly with craniofacial abnormalities, and ataxia (tremors, splayed legs, slower shaky pace, reduced mobility) with a reduced performance on rotarod tests (1.16 ± 1.69 s vs. 44.9 ± 21.4 s for wild-type and heterozygous controls). The mice were also lethargic, and a subset developed seizures at 6 months. Urine NAA levels were approximately eightfold higher in CD mice (1,541 ng/mg creatinine) vs heterozygotes (184 ng/mg creatinine) or WT (170 ng/mg creatinine). Lastly, brain scanning revealed abnormalities resembling CD.

Two other CD mouse models have since been created. In 2003 [337], the mutagen N-ethyl-N-nitrosourea was used to create a nonsense mutation, Q193X, in the Aspa gene. This strain is known as AspaNur7 [337] and was later characterized as a new CD model [86]. In 2011, an aspartoacylase-lacZ knock-in mouse model was engineered, where the bacterial β-galactosidase (lacZ) gene is inserted after the Aspa regulatory elements to abolish Aspa expression [30]. This strain was later used to demonstrate altered central auditory processing, with impaired speed of nerve conduction and hypomyelination in the central auditory system [40].

In addition to the mouse models, a naturally occurring rat CD model, known as the tremor rat, has been utilized for in vivo studies. The tremor rat carries a deletion spanning at least 200 kb including the four genes, encoding an olfactory receptor, ASPA, the vanilloid receptor subtype I, and the Ca2+/calmodulin-dependent protein kinase IV [27].

While all models show resemblance to CD patients, Ahmed et al. [107] point out that compared to the original mouse model from 2000, the three other rodent models have less severe phenotypes and near-normal lifespans. Likewise, Mersmann et al. [30] point out that neurological phenotypes vary considerably between different rodent models.

When working with the tremor rat model, it is hard to attribute phenotypes specifically to ASPA-deficiency rather than other effects of the deletion [15]. Since the rat brain arguably resembles the human brain more than the mouse brain, a clean ASPA deficient rat model would be desirable.

NAA metabolism and the normal role of ASPA

The ASPA substrate, NAA, is found in high concentrations within the brains of mammals and birds [109,110,111,112]. Within the human brain, it is found in concentrations of ~10 mM depending on the specific brain area [113, 114], thus making it one of the most abundant amino acids in the brain, second only to glutamate [56]. NAA is synthesized within the mitochondria of neurons [115,116,117] from aspartate and acetyl-coenzyme A by the enzyme aspartate N-acetyltransferase (Asp-NAT or ANAT) [118, 119] encoded by the NAT8L gene [120] (Fig. 5).

Fig. 5

Overview of the NAA cycle. NAA is synthesized from acetyl-CoA and aspartate in the mitochondria of neurons by the enzyme aspartate N-acetyltransferase (ANAT), and subsequently transported to the cytosol by unknown transporters. NAA may either be released from neurons or converted into N-acetyl-aspartyl glutamate (NAAG) in a reaction catalyzed by NAAG synthetase I or II (NAAGS). NAA release likely occurs through ABCC5 as well as other uncharacterized transporters. Upon its release, NAA may be taken up by astrocytes or oligodendrocytes through the sodium-dependent dicarboxylate cotransporter 3 (NaDC3) or exchanged through gap junctions. Within oligodendrocytes, ASPA hydrolyzes NAA to aspartate and acetate, which can be utilized by the cell. A fraction of NAA may also end up in the bloodstream. NAAG can be released from postsynaptic dendrites in response to stimulation of ionotropic glutamate receptors. NAAG then acts on presynaptic metabotropic glutamate receptor 3 (mGluR3) to inhibit further presynaptic glutamate release. Additionally, NAAG acts on mGluR3 on astrocytes to induce cyclooxygenase (COX1) activation, which in turn leads to release of prostaglandins to the vascular system. This in turn increases cerebral blood flow (CBF) to the area. Lastly, glutamate carboxypeptidase II/III (GCPII/III) catalyzes the hydrolysis of NAAG to NAA and glutamate.

Through an unknown transportation mechanism from the mitochondria to the cytosol, NAA accumulates in the neurons, reaching concentrations of up to 20 mM and accounting for 7% of neuron osmolarity [11]. From here NAA is excreted from the neurons. While the efflux of NAA from neurons is poorly understood, it is likely driven by the high intracellular/extracellular NAA concentration gradient [121], likely along with water molecules, thus serving as a means for neurons to expel metabolic water [12, 114, 122]. Transport has been speculated to occur via members of the solute carriers (SLC) superfamily [123]. Most likely, at least some of the efflux can be ascribed to the ubiquitous efflux transporter ABCC5, which is part of the superfamily of ATP-binding cassette (ABC) transporters [124, 125].

NAA release mechanisms

Some data suggest that NAA and its derivative N-acetyl-aspartyl glutamate (NAAG) (further discussed below) are released in response to neuronal depolarization, although there seems to be conflicting data as to whether the release is Ca2+-dependent.

One study found that transient (5 min) N-methyl-D-aspartate (NMDA)-receptor activation (60 µM) induced a long-lasting, Ca2+-dependent efflux of NAA in preparations of organotypic slices of rat hippocampus. Interestingly, the NAA efflux did not seem to be related to cell swelling or depolarization, but rather coupled to Ca2+ influx via the NMDA receptor. The efflux also seemed to persist for at least 20 min after the omission of NMDA [126].

One rat brain slice superfusion study found a basal efflux of NAA and its derivative NAAG, which could be increased by 300% by electrical stimulation, an increase that was sensitive to the voltage-dependent Na+ channel blocker tetrodotoxin [127]. Likewise, an older study used the same technique to demonstrate a largely Ca2+-dependent increase in released NAAG in superfusates from rat neocortex, piriform cortex/amygdala, and hippocampus upon depolarization with 50 mM K+ [128]. In rat microdialysis studies examining K+-induced local depolarizing stimuli, NAA levels in the extracellular fluid (ECF) were shown to consistently increase after the stimuli [121, 129], with one of the studies showing that the release occurred irrespective of whether or not Ca2+ was present in the perfusion medium [121].

In magnetic resonance spectroscopy (MRS) studies of NAA, it was shown that photo-stimulation leads to a decrease in NAA levels, likely corresponding to released and subsequently degraded NAA [130]. Later, the same drop in NAA level was shown in rat prefrontal cortex [131] and again in humans [132]. It should be noted, however, that a newer study found that NAAG and NAA in the visual cortex remained constant during continuous visual stimulation [133]. Thus, while we know NAA/NAAG is released, the mechanisms behind the release remain poorly understood. It has been suggested that this efflux of NAA, may be a mechanism of osmoregulation [12, 121, 130, 134, 135].

NAA catabolism and the role of ASPA

Upon its release from neurons, NAA has been suggested to follow a major and a minor pathway [11]. In the major pathway, NAA is absorbed by oligodendrocytes and hydrolyzed by ASPA into acetate and aspartate, which, in turn, can be utilized by oligodendrocytes, astrocytes or neurons, or passed via the ECF into systemic circulation for use in other tissues. In the minor pathway, NAA may travel from the cerebrospinal fluid (CSF) into the bloodstream and subsequently be utilized by other cells or excreted in the urine in barely traceable quantities [11, 136] (Fig. 5).

The average lifetime of NAA has been estimated at 16–18 h in healthy individuals [12, 114, 137]. In CD patients, the lifetime is likely longer, at around 24–48 h [11] or 62 h [12]. In addition, the synthesis rate of NAA in human brain in vivo was measured at 9.2 ± 3.9 nmol/min per g in controls and 3.6 ± 0.1 nmol/min per g in CD patients [137], suggesting a negative feedback regulation of NAA on its synthesis.
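To put these rates and lifetimes in relation to each other, the short Python sketch below computes the relative synthesis rate in CD and a rough steady-state NAA pool size for controls. The pool estimate rests on a simplifying assumption that is ours and not from the cited studies: a single, well-mixed NAA pool at steady state, so that pool size equals synthesis rate multiplied by mean lifetime. All function and variable names are hypothetical.

```python
# Values quoted in the text [12, 114, 137]; names are illustrative only.
SYNTHESIS_CONTROL = 9.2    # nmol/min per g tissue, healthy controls
SYNTHESIS_CD = 3.6         # nmol/min per g tissue, CD patients
LIFETIME_CONTROL_H = 17.0  # h, midpoint of the quoted 16-18 h range


def relative_rate(case: float, control: float) -> float:
    """Fraction of the control synthesis rate retained in the case group."""
    return case / control


def steady_state_pool_umol_per_g(rate_nmol_min_g: float, lifetime_h: float) -> float:
    """Rough pool size in umol/g.

    ASSUMPTION (not from the source): one well-mixed pool at steady state,
    so pool = synthesis rate x mean lifetime.
    """
    minutes = lifetime_h * 60.0
    return rate_nmol_min_g * minutes / 1000.0  # nmol -> umol


cd_fraction = relative_rate(SYNTHESIS_CD, SYNTHESIS_CONTROL)
control_pool = steady_state_pool_umol_per_g(SYNTHESIS_CONTROL, LIFETIME_CONTROL_H)

print(f"CD synthesis rate relative to controls: {cd_fraction:.0%}")
print(f"Estimated control NAA pool: {control_pool:.1f} umol/g")
```

Under these assumptions, the control pool comes out at about 9.4 µmol/g. If 1 g of tissue is taken as roughly 1 mL (a further assumption), this is of the same order as the ~10 mM brain concentration quoted earlier, suggesting that the quoted rate and lifetime are mutually consistent.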

NAA uptake and transport mechanisms

Like its excretion from neurons, the uptake of NAA into glial cells is not well described. The sodium-dependent dicarboxylate cotransporter 3 (NaDC3) encoded by the SLC13A3 gene has been shown to be responsible for NAA uptake, along with three sodium ions, in rat astrocytes [138, 139]. Since the transporter is also found in oligodendrocytes [123, 140, 141], NaDC3 is a potential candidate for NAA transport into oligodendrocytes and, to our knowledge, the only transporter suggested in the literature. Accordingly, studies have shown increased NAA levels in the urine of Slc13a3 homozygous knock-out mice [142]. Although this further emphasizes the importance of NaDC3 in NAA uptake [123], it is not possible, without the use of tissue-specific knock-out strains, to conclude whether this effect is due to the lack of NaDC3 in the brain or in the kidneys, where NaDC3 is also found within the proximal kidney tubule cells [143].

Alternatively, connexins (combined in hexameric complexes known as connexons) form gap junctions between astrocytes and oligodendrocytes, thus forming a “glial syncytium”, which may allow the transport of NAA from astrocytes to oligodendrocytes [123, 144].

Rather than connecting with a connexon on another cell, connexons may also form hemichannels, which enable the release of metabolites to the extracellular space. Many gap junctions are found between astrocytes (astrocyte-astrocyte junctions), while there are fewer astrocyte-oligodendrocyte junctions, and few or none between oligodendrocytes themselves or between neurons and glial cells [144].

The astrocyte-oligodendrocyte (A/O) gap junctions have been found between astrocyte processes and the oligodendrocyte cell body, its processes, and its outer (abaxonal) layer of the myelin sheath in both WM and GM [144]. The astrocyte-oligodendrocyte gap junctions include connexins Cx26, Cx30, and Cx43 expressed in astrocytes and Cx29, Cx32 and Cx47 expressed in oligodendrocytes [144, 145]. Intriguingly, the importance of connexins in CD seems supported by detrimental effects on oligodendrocytes and myelination observed upon loss of some A/O connexins in both humans and mouse models [146,147,148,149,150]. As has been pointed out by multiple authors [113, 123], these phenotypes resemble those of CD. However, they do not necessarily relate to NAA transport. For example, gene variants in Cx32 have been linked to X-linked Charcot–Marie–Tooth disease, where demyelination occurs in the peripheral nervous system rather than the CNS. Considering that the role of NAA mainly involves the CNS, this suggests that connexin defects may cause demyelination through mechanisms independent of NAA transport [146]. Indeed, it is easy to imagine that many other consequences of defective connexins in astrocytes and oligodendrocytes can disrupt processes such as myelination. However, so far, no studies have addressed NAA transport in the context of defective connexins and their potential role in the disease phenotype.

N-acetyl-aspartyl glutamate

NAA is generally not considered a neurotransmitter, although one paper argues otherwise, suggesting it acts on the G protein-coupled metabotropic glutamate receptor (mGluR) to induce an inward current that results in excitation of the neurons [151]. However, its derivative N-acetyl-aspartyl glutamate (NAAG) fulfills most criteria of a neurotransmitter while likely also being widely distributed and the third most prevalent transmitter in the mammalian nervous system after glutamate and γ-aminobutyric acid (GABA) [152, 153] as well as the most abundant dipeptide in the brain [154]. Interestingly, the brain distribution of NAA and NAAG seems to be distinct [155].

NAAG is synthesized from NAA and glutamate by NAAG synthetase I and II in neurons [156, 157]. Upon stimulation of ionotropic glutamate receptors on the dendrites of postsynaptic neurons, NAAG is secreted and acts as an agonist on the presynaptic metabotropic glutamate receptor 3 (mGluR3), thereby reducing further glutamate release [157,158,159]. In addition to this retrograde NAAG release, presynaptic NAAG release has been reported in retinal neurons [160] and in the neuromuscular junction [161].

NAAG also acts on mGluR3 on astrocytes [125, 159] to induce activation of the cyclooxygenase COX1 in astrocytes, leading to secondary release of prostaglandins to the vascular system. This in turn induces a hyperemic response leading to increased blood flow (i.e. increased oxygen and glucose) to the area [125]. Consequently, NAAG plays a role in regulating cerebral blood flow [125, 162]. Equally important, NAAG is catabolized to glutamate and NAA by the membrane-anchored glutamate carboxypeptidases II [163] and III [164] (GCP-II and GCP-III) expressed on the extracellular face of astrocytes [113, 163,164,165,166], thus constituting a second source of NAA (Fig. 5). Lastly, it has been reported that NAAG is further modified to N-acetylaspartyl-glutamylglutamate (NAAG2) by NAAGS-II, although the significance of this product is poorly described in the literature [156].

NAA and NAAG release in grey and white matter

One aspect likely useful for elucidating the role of NAA and its related metabolite NAAG in relation to CD, is the distinction between release of the compounds in the brain WM and GM, respectively. However, not much literature has focused on this.

In healthy individuals, it has been reported that NAA levels are higher in GM than in WM, likely due to the greater neuronal density in GM [167,168,169]. Conversely, NAAG is found at higher levels in WM than in GM [170,171,172]. In line with these observations, Baslow and Guilfoyle [13] argue that NAA efflux from neurons to ECF occurs in GM in response to neuronal stimulus as a means of fulfilling an osmoregulatory role of NAA, whereas in WM the only source of NAA is the catabolism of NAAG. The authors point out that GM is highly vascularized, is the primary site for energy production, and includes neurons with unrestricted surfaces, which allows afflux and NAA/water efflux to occur readily, unlike WM where such exchange is highly limited due to axonal myelination [13]. They also point to the high NAAG peptidase activity in WM [166], and the fact that of all known metabotropic glutamate receptors, only the target receptor for NAAG, named GRM3, is present in WM [13, 173].

NAA and NMR

Due to its abundance and high visibility with nuclear magnetic resonance, NAA has proven useful for MRS and magnetic resonance imaging (MRI) in the brain [71, 174, 175]. The three equivalent hydrogen atoms of the acetyl group resonate in NMR with a single, sharp peak, with a chemical shift of 2.02 ppm relative to the standard tetramethylsilane. While NAA is responsible for the majority of the signal, NAAG, N-acetylneuraminic acid, and underlying coupled resonances of glutamate and glutamine also contribute. In particular, NAAG may contribute 15% to 25% of the peak signal [113].
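The practical consequence of the NAAG contribution can be illustrated with a small Python sketch that brackets the NAA share of a measured 2.02 ppm peak. This is a deliberately simplified illustration: it only subtracts the quoted 15% to 25% NAAG contribution [113] and ignores the smaller contributions from N-acetylneuraminic acid, glutamate and glutamine. The peak value and all names are hypothetical.

```python
# Quoted range for the NAAG share of the 2.02 ppm peak [113].
NAAG_FRACTION_RANGE = (0.15, 0.25)


def naa_component(peak_signal: float, naag_fraction: float) -> float:
    """Signal left for NAA after subtracting an assumed NAAG fraction.

    ASSUMPTION: other minor contributors to the peak are neglected.
    """
    if not 0.0 <= naag_fraction <= 1.0:
        raise ValueError("naag_fraction must lie between 0 and 1")
    return peak_signal * (1.0 - naag_fraction)


peak = 10.0  # hypothetical peak signal, arbitrary units
naa_low = naa_component(peak, NAAG_FRACTION_RANGE[1])   # largest NAAG share
naa_high = naa_component(peak, NAAG_FRACTION_RANGE[0])  # smallest NAAG share

print(f"NAA accounts for {naa_low:.1f} to {naa_high:.1f} of {peak:.1f} units")
```

In other words, under these assumptions a peak reported as "NAA" may overestimate the actual NAA signal by up to a quarter, which is why the combined signal is best interpreted with the NAAG contribution in mind.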

NAA has also been considered a marker of neuronal health, with low NAA levels, indicative of poor neuronal health, being observed in multiple neurological diseases such as schizophrenia, amyotrophic lateral sclerosis, multiple sclerosis, epilepsy, Alzheimer’s disease, and Parkinson’s disease [49, 71, 88, 175]. Conversely, an elevated NAA level is an indication of CD [113].

Other roles of NAA

While NAA is clearly best described for its roles in the brain, it also seems to play a role in several other cell types. Indeed, NAT8L is also expressed in brown adipose tissue (BAT) at levels comparable to the brain, and to a lesser extent in white adipose tissue (WAT), suggesting that NAA synthesis also occurs in these tissues [176]. Notably, ASPA is also expressed in both BAT and WAT at levels similar to the brain [177], and a role of NAA in these tissues seems further supported by the phenotype of ASPA-deficient mice (lower total body-fat percentage and weight) [41, 178]. Interestingly, body weight is restored to wild-type levels in Nat8L/Aspa double knock-out mice [41]. It should be noted that Madhavarao et al. [179] reported no difference in animal weight between WT and ASPA KO mice. However, this is likely due to the age of the animals, as 17-day-old animals were used, whereas in Surendran et al. the weight difference was only noticeable in four-week-old mice [178] and in Jonquieres et al. [41] the weight difference was not evident between two and six months.

It has also been suggested that NAA, through the minor pathway, may provide a brain-specific mechanism for excretion of aspartate-associated nitrogen, similar to the synthesis of urea in the liver [113]. Supporting this idea is the readiness with which NAA is transported out of the brain, which becomes apparent in CD patients, where the NAA levels rise much more in the urine than in the brain [113, 180]. Removing aspartate from neurons through its acetylation to NAA would also favor α-ketoglutarate formation from glutamate (catalyzed by aspartate aminotransferase), thus improving energy production through the TCA cycle. This would help accommodate the large energy demands of neurons without producing ammonia, as would occur when α-ketoglutarate is produced via the glutamate dehydrogenase reaction. Considering these roles of NAA in neuronal energy metabolism, its aforementioned use as a marker of neuronal health is hardly surprising [88].

Furthermore, it has been suggested that NAA may serve as a reservoir for glutamate, thus negating the cytotoxic effects of high glutamate levels [181, 182]. Indeed, NAA and glutamate are inherently linked through metabolic pathways, most notably the tricarboxylic acid and the glutamate–glutamine cycles [113]. A similar role as glutamate reservoir could be proposed for NAAG as well [183].

Between the high NAA abundance in the brain, conserved only within the animal classes with the most advanced brains (birds and mammals) [109], its fast metabolic cycle across multiple cellular compartments, elusive transport mechanisms within the body and importance for neurotransmitters such as NAAG and glutamate, it is hardly surprising that NAA has been mentioned as the most enigmatic free amino acid in the human brain [11].

NAA in Canavan disease

In CD patients, NAA hydrolysis in the oligodendrocytes is abolished. This causes major changes to NAA and associated metabolites. As NAA cannot be hydrolyzed by oligodendrocytes in the brain, more NAA is removed through the minor pathway, where NAA passes from the ECF to the bloodstream and is eventually filtered out by the kidneys. This process leads to the elevated NAA levels in blood (NAA acidemia) [12, 88] and urine (N-acetylaspartic aciduria) [11, 32, 88, 114, 184].

In CD, the neuronal NAA synthesis persists, although only at approximately 38% of the normal rate [12, 114, 137]. Notably, the metabolic consequences of deficient ASPA activity within the brain also include loss of the high neuronal NAA gradient, which may reduce the efficiency of NAA export from these cells. Additionally, as the aspartate from NAA cannot be reused for NAA synthesis, there is an aspartate deficit, which must be made up for by aspartate production from other metabolic resources [184].

Although ASPA is widely considered the only known enzyme that can metabolize NAA [185], the enzyme amidohydrolase I, which is highly expressed within astrocytes in the brain and in the kidney [186, 187], might be able to hydrolyze NAA to some extent, based on results from protein purified from rat [47] and trout [188] brain. This contribution may play a role in CD patients, where ASPA functionality is abolished, but obviously does not restore oligodendrocyte-specific hydrolysis [114].

NAA levels in the brain of CD patients were found to be elevated by approximately 50% compared to healthy controls [11, 24, 189], although one study reported NAA to not be significantly elevated [190]. Based on older publications, the normal NAA brain WM level is 5–10 mM as opposed to 15–20 mM in untreated CD patients [191].

In healthy individuals, NAA levels in urine have been reported in the range from 5 to 20 mmol/mol creatinine, whereas the levels in CD patients were in the range of 391–3073 mmol/mol creatinine [69, 192]. Notably, NAAG was also elevated in the urine of CD patients [193]. In the CSF of a CD patient, the NAA concentration was measured at 611 μmol/L, whereas in 10 control samples, the level was below the detection limit of 2.3 μmol/L [69]. In the same CD patient, the blood NAA concentration was 7 μmol/L [69], as opposed to 0.11 mmol/L [194] and 0.44 μmol/L [195] in healthy individuals. This low concentration compared to urine levels suggests NAA is effectively filtered out of the blood in CD patients [69]. In addition to the elevated blood NAA levels, the NAA aciduria may also be partly explained by the inability to recycle NAA in the kidneys [113], since ASPA activity in kidneys will also be non-functional in CD patients. Notably, variations in measuring techniques and fluctuations between CD patients mean that NAA values should be viewed as indicative rather than absolute defined values.
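To give a sense of scale for the urine and CSF values quoted above, the following Python sketch computes the span of possible fold elevations from the reported ranges. It is an illustrative calculation only: it pairs the extremes of the healthy and CD ranges, treats the CSF detection limit as an upper bound for controls, and leaves the blood values aside. All names are hypothetical.

```python
# Ranges quoted in the text [69, 192]; names are illustrative only.
URINE_HEALTHY = (5.0, 20.0)   # mmol/mol creatinine
URINE_CD = (391.0, 3073.0)    # mmol/mol creatinine
CSF_CD = 611.0                # umol/L, single CD patient
CSF_DETECTION_LIMIT = 2.3     # umol/L, controls were below this value


def fold_range(case_range, control_range):
    """Smallest and largest fold change obtainable from two (low, high) ranges."""
    smallest = case_range[0] / control_range[1]
    largest = case_range[1] / control_range[0]
    return smallest, largest


urine_low, urine_high = fold_range(URINE_CD, URINE_HEALTHY)
# Controls were below the detection limit, so this ratio is a lower bound.
csf_min_fold = CSF_CD / CSF_DETECTION_LIMIT

print(f"Urine NAA in CD: {urine_low:.0f}- to {urine_high:.0f}-fold above healthy range")
print(f"CSF NAA in CD: more than {csf_min_fold:.0f}-fold above controls")
```

With the quoted ranges, urine NAA in CD patients is elevated by roughly 20-fold at the least and up to about 600-fold, while the CSF value is at least about 265-fold above that of controls.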

In addition to elevated NAA levels, glutamate levels reduced by 46% were also observed in a CD patient [71, 189]. In the brain of Aspa homozygous knock-out mice, glutamate and its metabolic product γ-aminobutyric acid (GABA) were also shown to be reduced [71, 178, 196], while aspartate was elevated [178]. The same study proposed that the reduced glutamate levels and elevated aspartate levels observed in mice are related to the reduced aspartate aminotransferase activity reported in the CD mouse model [178]. However, whether this is the cause or the effect of a dysfunctional metabolism caused by lack of aspartoacylase is hard to say. A differential gene expression analysis of a CD patient found that glutamate and aspartate metabolism were significantly dysregulated, along with changes in genes involved in apoptosis, muscle contraction and development, mitochondrial oxidation and inflammation [197]. Indeed, it is possible that indirect metabolic dysfunctions resulting from lack of NAA catabolism play a significant role in causing or aggravating the symptoms of Canavan disease [184], a notion supported by the benefits of dietary acetate supplementation (as discussed further in the following section).

Etiology of Canavan disease

Various hypotheses have been proposed to explain the etiology of Canavan disease. Notably, several of these are not mutually exclusive, and therefore likely need to be considered in combination to explain the full pathology of CD. In the following, we will summarize the main findings that have led to the present understanding of CD etiology.

Neuronal toxicity

One possibility is that NAA accumulation leads to neuronal excitotoxicity [123], similar to what has been reported for glutamate [198]. It was shown that direct injections of NAA (4 or 8 µmol doses) into normal Wistar rats led to seizure-like symptoms resembling those observed in CD patients, and lower doses (2 µmol) were sufficient to induce the same seizures in tremor rats [27, 199]. However, presuming NAA is not a neurotransmitter, direct neuronal excitotoxicity of NAA seems unlikely. Rather, the seizures observed in CD patients may be caused by imbalances resulting from the disrupted NAA metabolism, such as reduced glutamate (and GABA, based on mouse studies) [189] and potentially higher NAAG levels [193, 200], both of which are reported neurotransmitters [152, 201, 202]. A general cytotoxic effect of NAA also seems unlikely, as NAA levels are higher in GM than in WM [167,168,169], while CD mainly affects the WM. In addition, feeding Sprague–Dawley rats NAA at 500 mg/kg of body weight/day, administered for two consecutive generations, showed no changes in neurobehavioral tests [203]. This could have been due to lack of neuronal uptake, breakdown in the gut or hydrolysis by ASPA. Nevertheless, in another study, feeding healthy mice N-acetyl-aspartate monomethyl ester until their brain NAA levels were comparable to those of CD mice did not elicit neuropathological abnormalities [204]. In contrast, in more recent data based on three-dimensional human iPSC-derived myelin spheroids consisting of neurons, astrocytes and myelin sheath-forming oligodendrocytes, adding 5 mM NAA had a toxic effect on oligodendrocyte myelination [205].

The osmotic-hydrostatic hypothesis

The “osmotic-hydrostatic/molecular water pump” hypothesis goes all the way back to Canavan’s case report, in which moderate to extreme edema and signs of increased cerebral pressure were reported [65, 114]. According to this theory, the cycling of NAA from neurons to oligodendrocytes normally acts as a molecular water pump facilitating the transport of water out of neurons [19, 122, 206]. This is accomplished by utilizing the high intracellular/extracellular NAA gradient of neurons to drive water transport up its gradient [12, 19, 206, 207], similar to what is seen in the efflux N-acetyl-L-histidine water pump [208] and the influx Na+-glucose cotransporter [209, 210]. In CD patients, loss of the osmoregulatory role of NAA and its accumulation could explain the macroencephaly observed in CD patients, as well as the widespread leukodystrophy [11, 13]. Likewise, loss of NAA catabolism in myelinating oligodendrocytes would lead to an accumulation of NAA in the periaxonal space, increasing the osmotic pressure and subsequently causing the intramyelinic splitting, interlamellar edema and breakage at the paranodal seals, which constitute the dysmyelination characteristic of CD. This could also explain the vacuolization that is observed in the deep layers of cortex and subcortical WM in progressed stages of CD [11, 122], as well as the macrocephaly [74], increased CSF pressure [122, 211] and indications of increased water content in the brain [24].

Indeed, NAA is an important osmolyte in the brain, estimated to constitute 1% of the brain's dry weight and 3–4% of its total osmolarity [11], likely making it capable of such an osmoregulatory role. Additionally, NAA efflux is thought to occur along with at least 32 water molecules and a cation [114, 134]. The estimated NAA lifetime of ~17 h in healthy individuals, compared to > 24 h in CD patients, would also support a notable reduction in the osmoregulatory role of NAA in CD patients, lending further credibility to the hypothesis [11, 137]. Speaking against the model, however, is the fact that elevating NAA to supraphysiological levels by overexpressing Nat8L did not elicit any neurological deficits [41].

The acetyl-lipid myelin hypothesis

Another theory is the “acetyl-lipid myelin/oligodendroglial starvation” hypothesis, originally proposed in 1966 [212]. This hypothesis states that the acetate released by NAA hydrolysis in the oligodendrocytes is important for the lipid synthesis required to make the myelin sheaths. Consequently, in CD patients, oligodendrocytes are unable to acquire sufficient acetate from NAA hydrolysis, leading to improper myelin sheath development, thus explaining the observed dysmyelination [8, 88, 114, 212].

Supporting this theory is the well-documented incorporation of NAA-derived acetate into myelin lipids [114, 212,213,214,215,216]. Furthermore, in rats and other mammals, the rapid rise in NAA levels [109, 110, 214, 217] and ASPA expression [8, 37, 39] in the first few postnatal weeks coincides with a period of high myelination [218, 219]. Since myelination also occurs prenatally in humans [219], as opposed to mainly postnatally in rats [218], it would be interesting to investigate whether this is accompanied by an earlier rise in NAA and aspartate levels as well.

Speaking against this model, abolishing NAA synthesis by deletion of the acetyl aspartate synthase gene (Nat8L) in mice does not seem to affect myelination [220, 221]. If the acetyl-lipid myelin hypothesis holds true, the lack of NAA would be expected to cause severe dysmyelination. However, in a patient with no visible NAA or NAAG spectra (i.e. likely deficient for Nat8L, though not verified genetically), a retardation phenotype similar to Canavan disease and moderately delayed myelination were reported [222,223,224], contradicting the mouse studies.

Another study in CD mice, found that the abolishment of Nat8L suppressed the loss of cerebral cortical and cerebellar neurons otherwise seen in these mice [225], perhaps supporting the osmotic-hydrostatic hypothesis.

Also going against the acetyl-lipid myelin theory is the fact that restoring ASPA activity in astrocytes rather than oligodendrocytes seems to be sufficient to prevent the CD phenotype [226], although these results could be explained by transfer of NAA-derived acetate or similar metabolites from astrocytes to oligodendrocytes.

The acetyl-lipid myelin hypothesis also fails to account for the cellular and extracellular edemas observed in CD. Likewise, opponents point out that acetate could be provided to oligodendrocytes by means other than NAA from neurons, e.g. via glucose from astrocytes, indicating that NAA has purposes beyond merely being a carbon source [13, 114].

The oxidative stress hypothesis

In one study, tremor rats, which were continuously fed glyceryl triacetate (GTA) from 1 week after birth, showed improved motor performance and myelin galactocerebroside content as well as modestly reduced vacuolation [227]. This on one hand suggests NAA is not necessary for myelin sheath development, but simultaneously points to acetate-deficiency as an underlying cause of at least some of the Canavan disease symptoms.

Following that line of thought, the third “oxidative stress” hypothesis, proposed by Francis et al., points at increased oxidative stress due to the deficient NAA catabolism as the underlying cause of Canavan disease. Supporting this hypothesis are findings that show oxidative stress occurring before oligodendrocyte dysmyelination in the homozygous AspaNur7 mouse model [228]. Later, the same group demonstrated how administration of dietary triheptanoin (a synthetic triglyceride with a carbon chain length of seven, which can be catabolized to provide a source of acetate [15, 123]) reduced oxidative stress and alleviated CD symptoms by increasing myelination, reducing spongiform degeneration and improving motor function. Importantly, this treatment was effective only in younger mice, highlighting the significance of early postnatal myelination events and establishing the therapeutic intervention window accordingly [15]. The theory gains further support from observations showing reduced levels of acetyl-CoA and ATP in ASPA-deficient mice [87, 179, 229], and the fact that NAA is essential for juvenile Nat8L knock-out mice when on a fat-free diet [120]. Another study found that feeding mice with glyceryl triacetate, which provided a significantly better acetate source than calcium acetate, caused no overt pathology and increased acetate levels, but not NAA levels [230]. Likely supporting both the oxidative stress and the acetyl-lipid myelin hypotheses, knock-down of ASPA in immortalized brown adipocytes indicated deficiencies in acetyl-CoA and lipid metabolism, as shown by transcriptome analysis [177].

Dysregulation of the malate aspartate shuttle

Some of the CD phenotypes may also be the result of dysregulation of the malate aspartate shuttle (MAS) [231]. Mainly important in neuronal cells, MAS is considered to be the major redox shuttle system responsible for maintaining the NAD+/NADH ratio at levels favorable for the oxidative metabolism of glucose [232]. Considering that Aspa knock-out mice have been reported to have elevated aspartate levels and reduced glutamate levels [178], both of which are related to MAS, one might suspect some of the phenotypes of CD patients to be caused indirectly by dysregulation of MAS. Interestingly, deletion of the MAS gene Aralar (AGC1/SLC25A12) in mice leads to a drop in aspartate and NAA levels, highlighting its involvement in the NAA cycle [233]. These mice also exhibit hypomyelination and reduced levels of myelin-specific brain lipids, particularly galactocerebrosides, resembling what is seen in CD [232,233,234,235]. In one reported human case, a patient homozygous for the ARALAR Q590R missense variant also showed symptoms that conspicuously resembled CD, with normal development during the first months of life, followed by delayed psychomotor development and seizures at months 5 and 7, respectively, poor head control, severe muscular hypotonia, psychomotor retardation and global hypomyelination [236].

Dysregulation of histone acetylation and oligodendrocyte differentiation

Finally, it has been proposed that lack of available acetate may affect acetylation of histones, which in turn is important for oligodendrocyte differentiation [123]. Studies of oligodendrocyte maturation have highlighted the important role of histone acetylation and deacetylation in the epigenetic control of cellular differentiation from oligodendrocyte precursor cells to mature oligodendrocytes [237, 238]. Similarly, highly acetylated nuclear histones H2B and H3, indicative of the existence of non-compact chromatin as seen during early development, were detected in the WM of adult ASPA knock-out mice [239]. Another study performed on oligodendrocyte cultures also showed that NAA treatment resulted in alterations in the levels of histone H3 methylation, including H3K4me3, H3K9me2, and H3K9me3 [240]. While intriguing, a clearer cause-effect relationship needs to be established to corroborate this hypothesis.

More recently, a study in mice showed loss of ASPA activity to shift oligodendrocyte and neuronal markers towards a less differentiated state, which could be improved or normalized by reconstitution of ASPA activity [241]. However, further elucidation of the connection between ASPA activity and cell differentiation is needed if this data is to favor one hypothesis over another. Indeed, as oligodendrocyte myelin sheath synthesis, energy metabolism and epigenetic modifications are inevitably interconnected, the acetyl-lipid myelin-hypothesis, oxidative stress hypothesis and histone acetylation hypothesis may perhaps favorably be lumped together as one larger hypothesis, depending on the results of future studies.

Evidently, further studies into the NAA cycle and its cellular compartments may provide valuable information on the etiology of CD. For instance, using a conditional knock-out to induce ASPA deficiency after myelination has occurred may reveal the relative contributions of the acetyl-lipid myelin hypothesis and the osmotic-hydrostatic hypothesis, since the former seems to be mostly relevant for the early stages of life, where myelination is extensive, whereas one would expect the mechanism proposed by the latter to be detrimental even after the myelin sheaths have developed. In addition, standardized tests for assessments of CD symptoms in rodent models would help in comparing the results from individual studies.

Possible treatments

Much effort has been put into means to ameliorate the symptoms of Canavan disease and to explore the possibilities for a cure. Examples of palliative treatment include provision of proper nutrition and hydration [19] as well as treatment of seizures with anticonvulsants [19, 75] including acetazolamide, clonazepam, oxcarbazepine, phenobarbital and valproic acid [24]. In an unusual case of a CD patient, clobazam and primidone were administered to prevent frequently intractable seizures, after previous anticonvulsant treatment using phenytoin, levetiracetam and phenobarbital proved unsuccessful in getting the seizures under control [242].

Canavan patients may often benefit from machines assisting in respiratory functions, nebulizers to help administer medication, diverse positioning equipment to help accommodate the hypotonia and feeding pumps [75, 211].

In line with the main hypothesis for CD etiology, most drug treatments have aimed at reducing brain NAA levels and intracranial pressure or at providing a supplement to compensate for lost NAA catabolism. Following the success of glyceryl triacetate (GTA) in Aspa knock-out rat [227, 243] and mouse [230] models, the treatment was tested on humans. While GTA treatment was well tolerated, no improvements in motor function were observed, possibly due to the age of the patients (8 months and 1 year [244] & 8 and 13 months [243], respectively). Accordingly, Segel et al. [244] point out that an earlier intervention time point, within the first 3 months of life and before the appearance of severe symptoms indicative of irreversible brain damage, is likely to yield better results. In addition, increasing the GTA dosage might be considered given the lack of adverse effects in studies with the compound [227, 230, 243, 244]. By the same rationale, triheptanoin could potentially have beneficial effects as well [15, 123].

Acetazolamide, a carbonic anhydrase inhibitor with diuretic properties [245] and also an anti-seizure drug [246], was demonstrated to reduce the intracranial pressure in CD patients, but not the water content or NAA levels [19, 67, 211, 247].

Lithium and sodium valproate were shown to reduce NAA levels in rats [248]. Likewise, ethanol, pyrazole and several pyrazole derivatives also demonstrated an ability to reduce brain NAA concentrations [249]. However, when tested in tremor rats, only lithium chloride, and not ethanol, pyrazole compounds or valproate, was able to lower brain NAA levels [250]. Inspired by these findings, lithium citrate has been tested on humans on multiple occasions, with no signs of toxicity. Based on these studies, a decrease in NAA levels, along with slight improvements of some symptoms and more normal myelination, was reported in patients treated with lithium citrate [17, 251, 252]. The mechanism behind NAA reduction by lithium is unknown, although lithium has been suggested to prevent NAA release from neurons, to increase NAA removal from the brain by affecting permeability of the blood–brain barrier or from blood by affecting renal excretion [252], or to inhibit the NAA-synthesis pathway [19]. The anti-epileptic drug topiramate was also reported to slow head growth in two CD patients, though the mechanisms remain poorly understood [253].

NAA synthesis has also been targeted as a treatment possibility for CD. This idea has mainly been inspired by studies showing that CD mice lacking one or both Nat8L alleles display less severe spongiform leukodystrophy and neuronal loss [220, 221, 225]. However, complete ablation of N-acetyl synthetase activity is likely undesirable as well. Indeed, the brains of Nat8L knock-out mice display a reduced amount of sphingomyelin and sulfatide [240]. Another study reported that Nat8L knock-out mice have reduced myelin basic protein (MBP) levels in the prefrontal cortex as juveniles (but not as adults), and exhibit several behavioral deficits, which could be ameliorated by feeding glyceryl triacetate [254]. Various other papers have also indicated neurological issues associated with the Nat8L knock-out mice [41, 255,256,257]. Consequently, if an NAA synthesis inhibition strategy is pursued, reduction rather than complete ablation of NAT8L should be considered. Towards that goal, an adeno-associated viral vector carrying a short hairpin RNA against Nat8L has been used to suppress spongiform leukodystrophy in neonatal CD mice [258]. Small molecule inhibitors against the NAA synthetase are also being investigated [18, 259, 260].

Gene therapy for treatment of CD

Of course, the ideal treatment for CD would be to restore ASPA functionality in the brain. Various efforts have been put toward this endeavor. Enzyme replacement therapy using purified ASPA, PEGylated to reduce immunostimulation and increase half-life, has been tested on mice [261], inspired by similar approaches used on the enzyme phenylalanine hydroxylase [262]. A follow-up paper was published a few years later, showing the PEGylation increased diffusion of ASPA from capillaries to surrounding tissues, and exhibited reduced immunogenicity [182].

Currently, the most intensely investigated approach, and arguably the best chance at curing CD, is the use of gene therapy to restore intracellular ASPA synthesis, ideally in oligodendrocytes specifically. Indeed, the nature of CD makes it a good target for gene therapy, as the severity of the disease justifies the risks associated with the treatment, which currently entails creating one [23] or multiple [21, 191] burr holes to deliver the treatment directly into the brain.

Gene therapy involving a non-viral lipid-entrapped, polycation-condensed delivery system was used to deliver an adeno-associated virus (AAV)-based plasmid encoding recombinant ASPA. The therapy was used on two CD children (19 months and 24 months old, respectively), following promising data on HEK293 cells, Fischer rats and cynomolgus monkeys. The treatment was well tolerated and did lead to some clinical improvements and reduced NAA levels in the patients, although the NAA levels rose in the months after the therapy, indicating a drop in exogenous ASPA expression over time [23]. A later paper suggested that the limited effects and transiency of the aforementioned trial, as well as of a larger trial using the same approach (I.N.D.-7307), were primarily due to inadequacies of the vector or delivery system, instead advocating for AAV capsids as delivery vectors [191].

In 2003, an AAV vector encoding ASPA was tested on CD mice, showing reduced NAA levels and less spongiform degeneration. However, the vector did not achieve widespread CNS transduction, as areas distal to the injection sites were unaffected by the treatment [263].

Soon thereafter, an AAV2 vector expressing ASPA was tested on tremor rats, lowering their NAA levels and improving their balance and locomotion performance (as measured by a rotarod test) [264]. Other ASPA gene therapy trials in rats include successful treatment of absence-like seizures using adenovirus [265] and the use of a chimeric rAAV1/2 system, which restored ASPA activity for up to 6 months, reduced NAA levels and rescued the seizure phenotype, but did not affect gross brain pathology, such as dilated ventricles and spongiform vacuolization [42].

Following these tests, clinical trials on humans using an AAV2 vector were performed, reporting minimal systemic signs of inflammation or immune stimulation in all subjects [21]. Although a subset of the subjects (3 out of 10) were found to have low to moderately high levels of AAV2 neutralizing antibodies relative to baseline [21], a follow-up study found no long-term adverse events related to the AAV2 vector [24], while demonstrating a long-lasting reduction in NAA levels and slowed progression of brain atrophy [24].

Further advancement came with the utilization of glia-specific promoters to induce ASPA expression in glial cells. In 2013, it was shown that the GFAP promoter is highly specific for astrocytes following vector infusion into the brains of neonatal and adult mice. In contrast, the MBP promoter, although unspecific in neonates, was specific for oligodendrocytes in 10-day-old mice [266]. Later on, these promoters were used in mice along with AAV vectors to restore ASPA activity specifically in astrocytes [226] and oligodendrocytes [41], respectively.

Likewise, microRNA (miRNA)-mediated post-transcriptional detargeting was used in one study to limit ASPA expression in off-target cells [107]. This process involves the inclusion of miRNA-binding sites in the ASPA-encoding cassette, which are targeted by miRNA specific to peripheral tissues, thus limiting the expression in these tissues.

AAV vectors with tropism for oligodendrocytes (Olig001) have also been developed [267] and tested on neonatal CD mice [229] and on 6-week-old CD mice [268], which exhibit more progressed CD symptoms similar to what is observed clinically. It should be noted, however, that despite the improved binding to oligodendrocytes, transduction of other cell types still occurs [268]. While both viral tropism and choice of promoter should be optimized for the targeted cell type to maximize gene therapy efficacy, some expression in other tissues may not pose any issues or may be negligible compared to the benefits. At the very least, restoring ASPA activity in astrocytes of CD mice seems to cure the disease without any apparent downsides [226].

Seeing the potential of reducing NAA synthesis as a treatment of CD, a 2022 study opted for a combined gene therapy treatment that expressed ASPA and knocked down Nat8L, demonstrating its ability to reverse CD in 12-week-old mice and advocating for its potential in treating more progressed cases of CD [269]. A recent human gene therapy trial involves a rAAV9 vector to deliver transgenic ASPA regulated by a modified chicken β-actin (CB6) promoter [270]. Notably, the trial entails simultaneous systemic and intracerebroventricular injections, together with immunosuppressive drugs (rituximab and sirolimus) to prevent an immune response against AAV.

Currently (December 2023), two clinical trials on CD patients are ongoing (ClinicalTrials.gov Identifier: NCT04998396 and NCT04833907). NCT04833907 involves a single-dose intracerebroventricular injection using the Olig001 vector capsid in up to 24 CD children, whereas NCT04998396 uses an intravenous injection of an AAV9 vector to deliver transgenic ASPA with a ubiquitous promoter to induce ASPA expression in both neuronal and non-neuronal cell types in up to 18 patients. One challenge of gene therapy-based treatment of CD is achieving a widespread effect throughout the CNS, which is likely necessary to restore normal brain function. Consequently, AAV9 is of particular interest as it has been shown to pass the blood–brain barrier in mice, leading to widespread transduction of the CNS, while simultaneously being less invasive than intracranial delivery [211, 271].

Another promising approach to cure CD is the use of human induced pluripotent stem cells combined with either lentiviral integration of functional ASPA alleles to complement or, more ideally, homologous recombination to correct nonfunctional ASPA alleles. Using this approach, induced pluripotent stem cells (iPSC) from CD patient fibroblasts were engineered to express wild-type ASPA and differentiated into either neuronal progenitor cells (NPCs) or oligodendrocyte progenitor cells (OPCs) and then engrafted into immunodeficient Rag2−/− CD mice [22, 272]. Later experiments included the development of hypoimmunogenic human iPSC-derived OPCs [273]. If successful, these cells could be used as universal donors to treat CD patients, eliminating the requirement for autologous CD patient cells, inducing pluripotency, reconstituting ASPA expression and differentiating the cells into OPCs. At the time of writing, however, no clinical trials on CD patients using iPSC with reconstituted ASPA are ongoing, although the principle has been applied to CD mice [274]. One challenge of the stem cell approach might be to ensure ASPA activity is restored evenly throughout the brain, which is arguably more feasible with conventional gene therapy. While zones of reconstituted ASPA activity might be sufficient to deal with the elevated NAA levels, this might be insufficient to completely cure CD. Likewise, when validating the efficacy of gene therapies, one should note that urinary NAA levels, although easily obtained, may not reflect the severity of the disease [275]. Thus, urinary NAA levels should be accompanied by other measurements when assessing the efficacy of a given ASPA gene therapy treatment.

While reconstitution of functional ASPA activity resolves the cause of the disease, it does not miraculously fix the secondary pathologies of CD such as demyelination and vacuolization. Hence, like many other genetic diseases [276], early intervention is pivotal if the CD patient is to maintain normal development.

This point has been emphasized through multiple studies: In CD mice, initiating triheptanoin treatment in juvenile mice (28 days old) instead of neonates resulted in markedly more modest beneficial effects, indicating a window of therapeutic intervention that corresponds with developmental myelination [15]. Late onset of treatment was also pointed out as the most likely explanation for better effects of treatment with glyceryl triacetate in CD mice compared to CD patients [244]. Likewise, the results of a single-dose rAAV vector ASPA gene therapy treatment of CD mice correlated with treatment time, with injections as late as postnatal day 20 achieving efficacious and sustained improvements [107]. Human gene therapy studies further corroborate this point: In the first CD gene therapy trial, stronger effects of treatment were observed in the 19-month-old patient than in the 24-month-old patient [23]. This notion was supported by the later trial, in which it was noted that, although the therapy changed radiographic disease progression when given between 4 and 83 months of age, the greatest improvements were seen in subjects treated within the first two years of life. The authors put the ideal window of intervention at the range of 0–3 months [24].

Diagnosis of Canavan disease

With promising prospects for a gene therapy-based cure of CD, and new trials currently ongoing, an important hurdle to overcome seems to be the timely diagnosis of CD, so that the therapy can be administered within the optimal time window. Prenatal testing for CD can be done by enzyme activity assays, amniocentesis (followed by measurements of NAA levels) or genetic testing. ASPA activity has been measured in amniocytes or chorionic villi (obtained from chorionic villus sampling) [277, 278].

Activity assays based on amniocytes have been applied [279], but are not recommended due to the low activity found in control amniocytes [278, 280, 281]. Rather, chorionic villi should be used, as their ASPA activity is 10 times higher than in amniocytes and comparable to the levels observed in fibroblasts (which were originally used to assay ASPA activity) [280]. Due to the risk of false negative results and maternal contamination of the chorionic villus samples, activity assays should be accompanied by amniocentesis to estimate NAA levels [280].

Amniocentesis can be performed between weeks 16–18 [280] to determine amniotic fluid N-acetylaspartic acid levels [278]. However, this method is not always conclusive either, as slightly elevated NAA levels are harder to interpret [278, 282].

Consequently, DNA diagnosis is the method of choice for prenatal diagnosis of Canavan disease [180, 278, 282,283,284], ideally with the use of polymorphic markers to rule out maternal cell contamination [278, 284]. However, assuming detection of the disease early postnatally is sufficiently early to enable timely intervention of gene therapy, increased urinary NAA is a reliable marker for CD, especially to distinguish it from other leukodystrophies [88]. Indeed, when looking at newer publications on CD, diagnosis of the disease is determined by either detection of elevated NAA in urine or blood, magnetic resonance spectroscopy (MRS), magnetic resonance imaging (MRI), or clinical symptoms indicative of CD. Most often, genetic testing is used as well to confirm the diagnosis [74, 270, 285].

Notably, DNA diagnosis can be streamlined with DNA-based screenings of other monogenic diseases, thus eliminating the need for specialized equipment and personnel for detecting NAA levels. Additionally, improvements in sequencing techniques and reduced costs further favor the use of DNA-based diagnosis of CD patients. DNA-based diagnosis also enables carrier screening of parents and preconception counseling. This, combined with in vitro fertilization, allows the selection of healthy zygotes.

According to the Canavan Foundation website (https://www.canavanfoundation.org/), genetic screening for CD was historically recommended to Ashkenazi Jewish descendants and families with a history of CD. In the Western world, screening of newborns has been used for decades to test for between 1 and 30 conditions [286] and over the years, the list of diseases tested for through expanded carrier screening (ECS) [287] and newborn screening (NBS) [288] has been growing. Thus, considering the severity of CD and the promising prospects of a gene therapy-based cure, it seems plausible that screening for CD on a wider scale may be implemented in the future.

Prevalence of pathogenic ASPA variants

Due to low disease allele frequencies and a recessive inheritance pattern, Canavan disease has a low prevalence. However, within the Ashkenazi Jewish population, carrier frequencies are much higher, corresponding to a frequency of 1:38 to 1:82, depending on the source [19, 34, 105, 289,290,291,292,293,294,295]. Amongst this population approximately 97–98% of the disease-causing alleles are attributed to the alleles E285A (~ 83%) and Y231X (~ 14%) [19, 34, 73, 88, 293, 296], indicating a founder effect within this population [11, 34, 297]. On a smaller scale, another founder effect in an Indian community due to population bottleneck and isolation has also been reported, involving the G176S variant [298].
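To make the relation between carrier frequency and disease prevalence concrete, the following minimal sketch converts the carrier frequencies quoted above into an expected fraction of affected births. It assumes random mating within the population and simple autosomal recessive inheritance with full penetrance; the function name is illustrative and the result is a back-of-the-envelope estimate, not a reported epidemiological figure.

```python
from fractions import Fraction


def expected_affected_fraction(carrier_frequency: Fraction) -> Fraction:
    """Expected fraction of affected births under random mating.

    Both parents must be carriers (carrier_frequency ** 2), and each child
    of two carriers has a 1-in-4 chance of inheriting two disease alleles.
    """
    return carrier_frequency ** 2 * Fraction(1, 4)


# Carrier frequencies quoted in the text for the Ashkenazi Jewish population.
high = expected_affected_fraction(Fraction(1, 38))
low = expected_affected_fraction(Fraction(1, 82))

print(f"1 in {high.denominator}")  # 1 in 5776
print(f"1 in {low.denominator}")   # 1 in 26896
```

Under these assumptions, the quoted carrier frequencies correspond to roughly 1 affected birth in 6,000 to 27,000, which illustrates why the disease remains rare even in a population with a comparatively high carrier frequency.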

Outside the Ashkenazi Jewish population, pathogenic variants often arise due to de novo mutations or are confined to single families or small geographical areas [93, 299]. However, the European associated A305E variant is common, accounting for about 40% of all disease alleles in the non-Ashkenazi Jewish population [34, 43, 73, 93, 297, 299,300,301]. As of now, according to the Simple ClinVar database [302] (accessed December 2023), 72 pathogenic or likely pathogenic ASPA variants are known. Of these, 22 (31%) are missense variants, while the remainder are deletions, frameshifts, splice variants, etc. (Fig. 6A). The missense variants are spread throughout the ASPA coding region (Fig. 6B), consistent with the majority leading to a structural destabilization of the ASPA protein [25, 52]. Finally, when focusing on non-synonymous coding variants, most ASPA variants are pathogenic (61%), while only 8% are benign, and the remainder (31%) are variants of uncertain significance (VUS) (Fig. 6C).

Fig. 6

Reported ASPA gene variants. A The pathogenic and likely pathogenic ASPA variants reported in Simple ClinVar (https://simple-clinvar.broadinstitute.org/) [302] distributed between the indicated types of variants. Note that the missense variants represent the largest single class of variants. B Localization of the pathogenic and likely-pathogenic (red) missense variants and benign (green) missense variants on the ASPA primary structure. Note that these are roughly evenly distributed and do not cluster to specific regions. C Of the missense variants, most (45%) are pathogenic or likely pathogenic (red), while 12% are benign or likely benign (green), and the remaining (43%) are variants of uncertain significance (VUS) (grey)

Genotype–phenotype correlations

In recent years, the increased speed and reduced costs of DNA sequencing have led to a dramatic surge in the number of observed gene variants. Although most genetic variants are likely to be harmless, there is often insufficient evidence to classify newly observed variants as being pathogenic or benign, and these are therefore often designated as variants of uncertain significance (VUS) [303, 304]. In the case of ASPA, 33 gene variants are currently classified as VUS in the Simple ClinVar database (accessed June 2023) [302, 305]. The gnomAD database (accessed June 2023) [306] reports 324 ASPA gene variants with conflicting, unknown or no clinical classification. These large numbers of VUS pose a problem for diagnosis and genetic counseling of individuals or families that carry these and yet unknown variants. Moreover, with the promise of gene therapy, incomplete clinical classification of ASPA variants may constitute a barrier in the way of treatment.

Since gene variants that result in deletions, frame shifts or early stop codons, or that affect mRNA splicing, will typically result in dramatic changes in the encoded protein, these can often readily be assigned as pathogenic. However, in the case of missense variants, where one amino acid residue is exchanged with another, the variant effects may be more subtle and can range from increased activity to complete loss of function. Thus, genotype–phenotype predictions for missense variants are often more challenging, and accordingly missense variants account for 21 of the 33 currently classified VUS in Simple ClinVar.

Traditionally, clinical classification of VUS would be assessed in a one-by-one manner, involving meticulous laboratory and animal studies, yielding not only highly accurate results but often also detailed mechanistic insights on why a certain variant results in the observed phenotype (Fig. 7). However, considering the number of observed gene variants, this approach is not feasible in terms of time and costs. Recent developments in sequencing and molecular cell biology have allowed rapid functional assessment of all possible, both known and yet unknown, gene variants. These so-called deep mutational scanning (DMS) experiments or multiplexed assays of variant effects (MAVEs) (Fig. 7) [304, 307,308,309,310,311], along with continuous improvements in computational predictions [312,313,314,315], have shown great promise for VUS classification for a range of monogenic diseases, including CD [25].

Fig. 7

Methods for probing genotype–phenotype correlations. Genotype–phenotype correlations can be based on three different types of experimental setups, each with their own pros and cons. (Upper panel) In low throughput experiments, selected variants can be analyzed e.g. by animal studies, in primary cultures or by in vitro enzymatic and/or biophysical assays on purified protein. The results of such experiments are typically highly detailed and precise. However, determining variant effects in this manner is typically time consuming and expensive, and accordingly only a limited number of variants can be assessed in this manner. (Middle panel) Recent developments have allowed for high throughput measurements of variant effects. These so-called MAVEs typically probe either enzyme activity, protein abundance or protein–protein interactions. The advantages of these approaches are that they can inform on thousands of gene variants and, depending on the type of assay, they can also provide mechanistic detail (e.g. reduced variant abundance indicates that the variant causes a reduced structural stability). However, in comparison with the low throughput analyses, the obtained results are less precise. (Lower panel) Computational predictors of variant effects typically rely on phylogenetic conservation or structural data. They are rapid and scalable to millions of variants. However, they do not typically provide mechanistic detail and are still imperfect predictors of pathogenicity. Figure compiled using Inkscape (v1.3). Parts of the figure were made using BioRender.com and PyMOL (v2.5.2) using the PDB entry 2O4H [44]

Using the variant abundance by massively parallel sequencing (VAMP-seq) technology [311], we recently determined the relative abundance of 6152 out of the 6260 (~ 98%) possible single-site missense and nonsense ASPA variants in cultured human cells [25]. Combined with computational predictions based on the phylogenetic conservation and structure of ASPA protein, the results showed that many pathogenic ASPA protein variants are structurally unstable, thus rendering them susceptible to the intracellular protein quality control and degradation systems which in turn leads to insufficient amounts of ASPA protein. Accordingly, those VUS and yet unidentified ASPA variants that are observed at low abundance are likely to be pathogenic. Of the missense VUS currently listed in ClinVar, about 50% display a low abundance [25]. Conversely, those ASPA variants that were found at normal levels can be either pathogenic or benign, and with protein abundance as the sole read out it will not be possible to classify these without testing the enzymatic activity of the variants.
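As a brief worked example of where the size of the variant library comes from, the sketch below reproduces the numbers quoted above. It assumes that each of the 313 residue positions can be replaced by the 19 other amino acids or by a stop codon, which is one reading that is consistent with the reported total of 6260; the variable names are illustrative only.

```python
N_RESIDUES = 313          # length of human ASPA, as stated in the introduction
N_MISSENSE_PER_SITE = 19  # assumption: every amino acid except the wild-type one
N_NONSENSE_PER_SITE = 1   # assumption: one premature stop codon per position

# Total number of single-site missense and nonsense variants.
possible = N_RESIDUES * (N_MISSENSE_PER_SITE + N_NONSENSE_PER_SITE)

scored = 6152  # variants with a measured abundance, as reported in the text
coverage = scored / possible
missing = possible - scored

print(possible)           # 6260
print(f"{coverage:.1%}")  # 98.3%
print(missing)            # 108
```

In other words, only about a hundred of the possible single-site variants lack an abundance measurement in this dataset.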

In recent years we have seen tremendous progress in computational tools for prediction of variant effects. In comparison with studies on individual variants and high throughput technologies, the computational tools provide a quick, inexpensive and scalable alternative (Fig. 7) [316].

An early computational variant effect predictor that is still used today, is Sorting Intolerant From Tolerant (SIFT) [317], which is based on sequence homology. Since mutations occur randomly in all species and harmful variants are removed from the gene pool, variants at conserved positions in multiple sequence alignments (MSAs) of orthologous proteins are more likely to be pathogenic than variants at non-conserved positions. Accordingly, sequence conservation is fundamental to most computational variant effect predictors. We recently applied Global Epistatic Model for predicting Mutational Effects (GEMME) [313] to ASPA [25]. Like SIFT, GEMME is based on MSAs, but also accounts for co-variation of amino acid pairs and proves efficient in identifying CD-linked ASPA variants [25, 313]. Another recent MSA-based tool is the Evolutionary model of Variant Effect (EVE), which is based on unsupervised deep learning trained on amino acid sequences of over 140,000 species [312]. The EVE website (https://evemodel.org/, accessed March 2024) includes saturated predictions for ASPA and correctly assigns 14/17 pathogenic or likely-pathogenic and 3/4 benign or likely benign ClinVar missense ASPA variants. On the same set of variants, another recent predictor, AlphaMissense [315], misclassifies two other variants. Since those two variants (P257R and G274R) display very low abundance in the described VAMP-seq analyses [25], this reflects how DMS combined with computational predictors may result in very high accuracies.
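The idea that conserved alignment positions are less tolerant to substitution can be illustrated with a minimal sketch that scores each column of an MSA by its Shannon entropy. This is not the algorithm used by SIFT, GEMME or EVE, which are considerably more elaborate; the function name and the toy alignment are hypothetical and do not represent ASPA sequences.

```python
import math
from collections import Counter


def column_conservation(msa: list[str]) -> list[float]:
    """Return 1 - normalized Shannon entropy for each alignment column.

    A score of 1.0 means the column is invariant; lower scores mean that
    more residue types are tolerated at that position. Gaps are ignored.
    """
    n_cols = len(msa[0])
    if any(len(seq) != n_cols for seq in msa):
        raise ValueError("all aligned sequences must have the same length")
    max_entropy = math.log2(20)  # 20 standard amino acids
    scores = []
    for col in range(n_cols):
        counts = Counter(seq[col] for seq in msa if seq[col] != "-")
        total = sum(counts.values())
        if total == 0:  # column consists of gaps only
            scores.append(0.0)
            continue
        entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
        scores.append(1.0 - entropy / max_entropy)
    return scores


# Hypothetical toy alignment of four orthologous sequences (not real ASPA data).
toy_msa = ["MHEA", "MHDA", "MHEG", "MHKS"]
scores = column_conservation(toy_msa)
print([round(s, 2) for s in scores])
```

In this toy alignment, the first two columns are invariant and receive the maximal score, so a substitution there would be flagged as more likely deleterious than one in the two variable columns.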

While the computational variant effect predictors have become highly accurate, they are still imperfect. In addition, most computational tools do not provide mechanistic insight. Thus, they only provide information on which variants are likely pathogenic, and do not inform on why a particular variant is harmful. This is because residues that are conserved in MSAs can be conserved because they are critical for e.g. catalysis or substrate binding, but also for folding and stability of the protein structure. Parallel computational predictions of a protein’s structural stability using tools such as FoldX [318] or Rosetta [319] that use the crystal structure of the protein as input (rather than an MSA), have shown some success in untangling such effects [320, 321]. However, these structure-based computational protein stability predictors are also imprecise [322], as they for instance do not take into consideration the folding pathway of the protein.

Finally, as mentioned above, MD simulations offer a computationally more demanding approach to variant effect predictions, which also provides information on the variant mechanisms and has been applied to ASPA with varying success. Previously, we tested the effect of the pathogenic and low abundance C152W variant with MD simulation and found no substantial difference to wild-type ASPA on the timescales that could be probed [52]. Taking advantage of the solved structures of four ASPA variants (K213E, Y231C, E285A and F295S) [108], the Nemukhin group applied MD simulations to predict that two pathogenic ASPA variants, Y231C and F295S, may limit access to the active site [323]. This potentially explains the lower enzyme activity measured for these protein variants [63], although the pathogenicity of these variants is likely also explained by very low abundance [25]. Additional computational studies of aspartoacylase variants using MD simulations have been performed by the Zayed group [324,325,326]. Using MD simulations, they were able to show that the V31F variant caused conformational changes, affecting catalytic residues in the near vicinity (H21, E24, R63), and reducing the structural stability [324]. Later, the same group found the two pathogenic variants P183H and P183L to have consequences for the structure of ASPA [325]. This is in accordance with our VAMP-seq analyses, where all of these variants display reduced abundance [25].

Lastly, in 2017, Doss and Zayed [326] applied MD simulations and molecular docking to investigate the impact on structure and NAA-binding for four variants (K213E, Y231C, E285A and F295S), using the available crystal structures. Two additional variants of unknown significance (I143V and V186D) were investigated for effects on the structure [10, 327]. All variants were distinguishable from wild-type ASPA, and displayed a lowered compactness, reduced number of intramolecular hydrogen bonds, reduced binding to NAA and poorer accommodation of the Zn2+ ion, as calculated using the Protein–Ligand Interaction Profiler [328]. These data indicated that K213E and I143V were benign whereas V186D, Y231C, F295S and E285A were pathogenic to various degrees [326]. These data were corroborated by a range of variant effect prediction tools, including PANTHER, PhD-SNP, SIFT, SNAP, and Meta-SNP, the stability prediction tools iStable server [329] and evolutionary conservation using Consurf [330]. These variants, except for K213E, E285A and I143V, display strongly reduced abundances [25], suggesting that they affect ASPA folding and stability.

For additional information on computational tools for variant effect predictions we refer to these recent and excellent reviews [316, 331,332,333] and note that as more proteins are analyzed by DMS techniques, the computational tools are likely to improve further and may also offer mechanistic insights into the molecular effects of missense variants.

Outlook and concluding remarks

As mentioned, gene therapy is currently one of the more promising attempts at curing Canavan disease [23, 24]. However, due to the highly progressive nature of CD, gene therapy-based interventions will likely need to be administered early. Accordingly, rapid and accurate clinical assessment of novel ASPA variants will be critical, and high-throughput assays and computational approaches to gain comprehensive genotype–phenotype information on all possible ASPA variants are therefore warranted. The recent application of the VAMP-seq technology to ASPA revealed that the major molecular mechanism of the ASPA insufficiency in CD is coupled to a reduced structural stability of the ASPA protein variants [25], and the comprehensive dataset can thus be used in the clinical assessment of variants currently annotated as VUS, but also inform on novel variants that have not yet been encountered in population sequencing. Moreover, as these findings effectively categorize CD as a protein misfolding disorder (proteinopathy), they also potentially allow for novel small molecule-based therapeutic approaches. Presumably, many structurally unstable CD-linked missense ASPA variants will still retain some, albeit insufficient, catalytic function, and can thus be categorized as so-called hypomorphic alleles. Accordingly, it should in theory be possible to rescue such variants by boosting their cellular levels [334, 335]. Potentially, this can be achieved by increasing synthesis, blocking degradation or stabilizing structurally unstable variants [334], and further studies on potential transcriptional regulation of ASPA and the degradation of ASPA missense variants are therefore a priority. The recent characterization of the degradation of ASPA protein variants [25, 52] suggests that ASPA is an ideal model substrate for analyzing the cellular PQC system and will hopefully inspire additional studies on ASPA protein folding and degradation.
As for stabilizers or folding correctors, these could in principle be any small molecule adept at tightly and specifically interacting with native ASPA [335], without interfering with its catalytic function. Upon binding, the stabilizer would lock unstable ASPA variants in the native conformation, thus averting the PQC-linked degradation and resulting in increased levels of functional enzyme. Indeed, such small molecules have been developed for other hereditary protein folding diseases, most notably cystic fibrosis [336].

As highlighted above, another area for further research into ASPA and Canavan disease is the pathophysiological mechanisms of the disease. The main suggested mechanisms of pathogenicity are not mutually exclusive, and likely both contribute to disease progression. Further experiments, including animal studies, may reveal the relative importance of the proposed mechanisms, which in turn may pave the way for better treatments, and we hope the present literature survey may aid the progress of research aimed at deepening our understanding of ASPA and hasten the path towards effective treatments of Canavan disease.