Background

Mitochondria play an important role in energy metabolism, free-radical generation, and apoptosis [1]. Unlike other organelles, mitochondria are encoded and regulated by a specified genetic architecture that includes the mitochondrial genome and nuclear-encoded mitochondrial protein (NEMP) genes. Unlike genomic DNA (gDNA) which is inherited from both parents, mtDNA is maternally inherited and is composed of ~ 16.6 Kbp of DNA in a plasmid/circular structure that encodes 2 rRNAs, 22 tRNAs, and 13 polypeptides that perform structural or enzymatic functions in the mitochondrion, with the remaining mitochondrial components encoded by the ~ 1200 NEMP genes. Mutations within mtDNA that may affect mitochondrial function have previously been associated with neurological disease, drug toxicities, myopathies, and some cancers [2,3,4]. While mtDNA plays a major role in diseases associated with mitochondria, the NEMP genes can also contribute to all these disease states.

Stroke and dementia pathologies have also been linked to dysfunctional mitochondria, as seen through mitochondrial response, excitotoxicity, acidotoxicity, protein misfolding, and inflammatory reactions [5, 6]. A genetic contribution to stroke and dementia pathologies has also been observed. This is evident in monogenic causes of cerebral small vessel disease (CSVD), such as Cerebral Autosomal Dominant Arteriopathy with Sub-cortical Infarcts and Leukoencephalopathy (CADASIL), a condition which causes stroke, dementia, migraine, and epilepsy. CADASIL is caused by cysteine-altering mutations in NOTCH3, and altered mitochondrial proliferation, apoptosis, and overall function have been observed in this condition [7].

Furthermore, pathogenic variants identified in mtDNA and NEMP genes have also been shown to cause stroke and dementia phenotypes, in particular in conditions such as mitochondrial encephalopathy with lactic acidosis and strokes (MELAS) [MIM# 540000] and myoclonic epilepsy with ragged red fibres (MERRF) [MIM# 545000]. Symptoms of MELAS include stroke-like episodes, diminished intellectual and cognitive functions, and/or headaches, as well as seizures, sensorineural hearing loss, diabetes mellitus, cardiomyopathy, and gastrointestinal dysmotility, whereas MERRF symptoms include generalized epilepsy, ataxia, weakness, and dementia [8, 9]. For both conditions, the majority of variants have been identified in specific mitochondrial-encoded genes (MTTL1, MTTQ, MTTH, MTTK, MTTC, MTTS1, MTND1, MTND5, MTND6, and MTTS2); however, variants in POLG, which is one of the NEMP genes, have also been identified to cause MELAS [10]. As mitochondrial irregularities are evident in CADASIL-derived cells, mtDNA and NEMP genes were investigated to identify a potential novel cause of disease in patients who were referred for CADASIL testing but had no pathogenic NOTCH3 variants.

Methods

Patient Cohort

Blood samples were collected from clinically referred CADASIL patients (n = 50) who had no pathogenic NOTCH3 variant but had signs and symptoms indicative of a monogenic CSVD, including recurrent stroke and/or vascular dementia with no other risk factors. As these participants were originally referred by neurologists from around Australia, it was assumed that they all had strong clinical characteristics indicative of CADASIL. In cases where little to no clinical information was available, further information was requested from the referring physicians. Participants for whom no clinical information was subsequently provided were assumed to have a “possible” clinical diagnosis of CADASIL based on criteria which include age of onset ≤ 65 years, family history of stroke and/or dementia, absent or mild vascular risk factors, and MRI showing white matter change [11, 12]. The cohort had a M:F ratio of 1:2.3 and a mean age of 52.2 (± 13.1) years. All patients had approved diagnostic testing for CADASIL with their doctors, and ethical approval for this study was obtained through the QUT HREC along with appropriate consents for the patient cohort (Approval Number 1800000611). Diagnostic testing was performed using the Genomics Research Centre (GRC) custom 5-gene panel, where only the notch receptor 3 (NOTCH3) gene was analysed [13].

Whole Exome Sequencing

We used an initial input of 80 ng of genomic DNA for whole exome sequencing (WES), with library preparation performed using Ion AmpliSeq Exome RDY kits (Carlsbad, CA, USA) according to the manufacturer’s instructions (MAN0010084). Completed libraries were quantified using an Invitrogen Qubit 3 Fluorometer (Venlo, Netherlands) and combined at an equimolar concentration of 100 pM prior to sequencing. Template preparation, enrichment, and chip loading were performed using the Ion P1 Hi-Q Chef Kit (Cat. Number A26433) or Ion 540™ Chef Kit (Cat. Number A30011) and 540 Chips on the ThermoFisher Scientific Ion Chef (Carlsbad, CA, USA), targeted at 200-bp read lengths. Sequencing was performed using the Ion Proton and Ion S5+ platforms, with sequence alignment (hg19) and variant calling completed via the Ion Torrent software (Carlsbad, CA, USA).

Mitochondrial Sequencing

Mitochondrial sequencing (MitoSeq) was also completed using the Ion Torrent sequencing system, based on a protocol developed by the Genomics Research Centre [14]. This work involved performing two separate long-range PCRs (labelled fragment 1 and fragment 2) on each sample to amplify the mitochondrial DNA. Clean-up of the PCR product was completed using the QIAquick PCR clean-up kit (QIAGEN, Melbourne, Australia) and the resulting products were quantified using Bioanalyser 12000 Chips (Agilent, Melbourne, Australia). Fragments were then diluted to equal concentrations of 20 ng/µL and pooled for each sample. Sonication of each sample with the pooled fragments 1 and 2 was completed on 100 ng of input DNA in 51 µL of solution. End repair of the sonicated samples was completed using the NEBNext library preparation for Ion Torrent sequencing kit, and BIOO Scientific (Adelaide, Australia) barcodes were used to label each sample. The E-gel size selection system with 2% agarose gels was used to select fragments between 150 and 300 bp in size. A final amplification of these fragments was completed using the NEBNext library preparation for Ion Torrent sequencing kit, and final libraries were quantified using Bioanalyser 1000 DNA kits, or Bioanalyser HS DNA kits if the library yield was lower than expected. Completed libraries were then diluted to a concentration of 26 pM, and libraries with different barcodes were pooled to be run on Ion 530 chips in conjunction with the Ion Chef system, according to the manufacturer’s instructions. Following template preparation and amplification, the chips were loaded and sequenced on the Ion S5+.
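As an illustration only, the dilution of a quantified library to the 26 pM loading concentration can be worked through as below. This is a minimal sketch rather than the protocol's own calculation: it assumes the common approximation of 660 g/mol per double-stranded base pair, and the stock concentration (1 ng/µL), mean fragment length (250 bp), and final volume (1000 µL) are hypothetical values chosen for the example.

```python
# Approximate molar mass of one double-stranded DNA base pair (assumption).
BP_MASS_G_PER_MOL = 660.0


def ng_per_ul_to_pm(conc_ng_per_ul: float, mean_length_bp: float) -> float:
    """Convert a dsDNA mass concentration (ng/uL) to molarity (pM)."""
    if conc_ng_per_ul < 0 or mean_length_bp <= 0:
        raise ValueError("concentration must be >= 0 and length must be > 0")
    # 1 ng/uL = 1e-3 g/L; dividing by g/mol gives mol/L; 1 mol/L = 1e12 pM.
    return conc_ng_per_ul * 1e9 / (BP_MASS_G_PER_MOL * mean_length_bp)


def dilution_volumes(stock_pm: float, target_pm: float, final_volume_ul: float):
    """Return (stock_ul, diluent_ul) from C1 * V1 = C2 * V2."""
    if target_pm > stock_pm:
        raise ValueError("stock is already below the target concentration")
    stock_ul = target_pm * final_volume_ul / stock_pm
    return stock_ul, final_volume_ul - stock_ul


# Hypothetical library: 1 ng/uL with a mean fragment length of 250 bp.
stock_pm = ng_per_ul_to_pm(1.0, 250.0)
stock_ul, diluent_ul = dilution_volumes(stock_pm, target_pm=26.0, final_volume_ul=1000.0)
print(f"stock: {stock_pm:.1f} pM; mix {stock_ul:.2f} uL stock with {diluent_ul:.2f} uL diluent")
```

In practice the mean fragment length would come from the Bioanalyser trace for each library, so the volumes differ per sample.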

Analysis

Extraction of the rare and functional NEMP gene variants was completed by merging the WES variant call format (vcf) files using the bcftools vcf-merge function. VEP was used to annotate the merged vcf and VEP-filter was used to extract the functional and rare NEMP gene variants [15,16,17]. This involved thresholds of SIFT scores ≤ 0.05, PolyPhen scores ≥ 0.8, and a global MAF (gnomAD) < 0.001. Variants were then further annotated using VariantTaster and PredictSNP2 and stratified according to the number of in silico tools which classified the variant as disease-causing/deleterious. Variant databases such as ClinVar and dbSNP were then used to identify whether any variants had previously been found to play a role in disease. Gene ontology for each variant was investigated, focusing on tissue distribution of mRNA and protein expression. Samples were removed if there was no protein expression identified through the Protein Atlas Database in smooth muscle cells and neurological tissue. Boolean searches based on the gene of interest in Google and NCBI PubMed (https://pubmed.ncbi.nlm.nih.gov/) for Vasc*, Neur*, Leukodystrophy, encephalopathy, stroke, and dementia were used to further identify candidate variants. Information from the Boolean searches was used to identify the remaining candidate variants of interest from the mitochondrial genome and the NEMPs, which were then confirmed using Sanger sequencing.
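The threshold step described above can be sketched as follows. This is an illustrative reimplementation, not the VEP-filter command used in the study. It assumes that the three thresholds are applied jointly, that a variant absent from gnomAD counts as rare, and that variants lacking a SIFT or PolyPhen score are dropped. The variant labels, scores, and tool names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

SIFT_MAX = 0.05         # SIFT score <= 0.05
POLYPHEN_MIN = 0.8      # PolyPhen score >= 0.8
GNOMAD_MAF_MAX = 0.001  # global gnomAD MAF < 0.001


@dataclass
class Variant:
    label: str
    sift: Optional[float]
    polyphen: Optional[float]
    gnomad_maf: Optional[float]  # None = not observed in gnomAD
    tool_calls: Dict[str, bool] = field(default_factory=dict)  # tool -> deleterious?


def is_rare_and_functional(v: Variant) -> bool:
    """Apply the three thresholds jointly (assumption: logical AND)."""
    if v.sift is None or v.polyphen is None:
        return False  # assumption: unscored variants are not retained
    maf = 0.0 if v.gnomad_maf is None else v.gnomad_maf
    return v.sift <= SIFT_MAX and v.polyphen >= POLYPHEN_MIN and maf < GNOMAD_MAF_MAX


def deleterious_tool_count(v: Variant) -> int:
    """Number of in silico tools that call the variant deleterious."""
    return sum(1 for is_deleterious in v.tool_calls.values() if is_deleterious)


# Hypothetical annotated variants, for illustration only.
variants = [
    Variant("hypothetical_1", 0.01, 0.95, 0.0002, {"tool_a": True, "tool_b": True}),
    Variant("hypothetical_2", 0.20, 0.95, 0.0002, {"tool_a": False, "tool_b": True}),
    Variant("hypothetical_3", 0.00, 0.99, 0.02, {"tool_a": True, "tool_b": True}),
    Variant("hypothetical_4", 0.05, 0.80, None, {"tool_a": True, "tool_b": False}),
]

kept = [v for v in variants if is_rare_and_functional(v)]
ranked = sorted(kept, key=deleterious_tool_count, reverse=True)
for v in ranked:
    print(v.label, deleterious_tool_count(v))
```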

For the MitoSeq data, variant annotation was completed through MitoMaster software (https://mitomap.org/foswiki/bin/view/MITOMASTER/WebHome) using MitoMAP. Haplogroups for each sample were investigated using HaploGrep2 (https://haplogrep.i-med.ac.at/category/haplogrep2/) [18]. This initial annotation in MitoMAP identifies MitoTIP scores, and further annotation of the merged vcf data was completed using the MSeqDR mvTool [19]. This software identified population allele frequency; disease phenotypes derived from HmtDB, MitoTIP, MSeqDR, and ClinVar; genomic annotations derived from VEP; and in silico pathogenicity annotations derived from dbNSFP, CADD, and HmtDB. Only variants with a QUAL score > 20, coverage depth (DP) > 10, and a MAF < 0.001 were considered for pathogenic findings. Variants were excluded if they had previously been classified as polymorphic, likely benign, or benign. Remaining variants were then run through the MitIMPACT 3D (http://mitimpact.css-mendel.it/) tool with further annotation using disease phenotypes. Further filtering initially focussed on genes which had previously been associated with MELAS and/or MERRF, and then on variants that had previously been identified as pathogenic/likely pathogenic. Finally, remaining candidate variants which were not previously identified as causative of stroke and/or dementia disorders were investigated for gene ontology, gene/protein expression, and function.
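The quality, frequency, and classification criteria for the MitoSeq calls can be expressed as a single predicate, sketched below. This is an illustration under stated assumptions and not the study's pipeline: the record layout (a dictionary with `qual`, `dp`, `maf`, and `classification` keys) and the example records are hypothetical, and a missing classification is treated as not benign.

```python
QUAL_MIN = 20   # QUAL score must be > 20
DP_MIN = 10     # coverage depth (DP) must be > 10
MAF_MAX = 0.001  # MAF must be < 0.001
EXCLUDED_CLASSES = {"polymorphic", "likely benign", "benign"}


def is_candidate(record: dict) -> bool:
    """Return True if a mitochondrial variant call passes every stated criterion."""
    if record["qual"] <= QUAL_MIN or record["dp"] <= DP_MIN:
        return False
    if record["maf"] >= MAF_MAX:
        return False
    # Assumption: an empty or missing classification is not treated as benign.
    classification = (record.get("classification") or "").strip().lower()
    return classification not in EXCLUDED_CLASSES


# Hypothetical records, for illustration only.
records = [
    {"id": "call_a", "qual": 85.0, "dp": 1200, "maf": 0.00003, "classification": ""},
    {"id": "call_b", "qual": 18.0, "dp": 1500, "maf": 0.00003, "classification": ""},
    {"id": "call_c", "qual": 90.0, "dp": 900, "maf": 0.014, "classification": ""},
    {"id": "call_d", "qual": 90.0, "dp": 900, "maf": 0.0002, "classification": "Likely benign"},
]

candidates = [r["id"] for r in records if is_candidate(r)]
print(candidates)
```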

Results

Mitochondrial Sequencing

The MitoSeq data showed good overall coverage, with a median coverage > 1000×. The high coverage depth was accompanied by high sequence quality, with most reads showing a mean Phred score > 25. Furthermore, the per-base quality across fragment sizes also showed Phred scores > 25 for all samples for reads up to 250 bp. Across all samples, read lengths were typically between 100 and 250 bp, with most samples having a read length distribution centred at ~ 200 bp. The merged vcf file identified a total of 1178 variants, with 302 unique variants across the 50 samples. In conjunction with the extracted mitochondrial reads, no variants were identified in this sequence data that have previously been identified as causative of MELAS, including the m.3243A > G variant.

The 302 variants identified from the MitoSeq data were filtered according to the MitoMap minor allele frequency, predicted pathogenicity (MitoMaster, HmtDB), and clinical databases (ClinVar, dbSNP, HmtDB, and MitoMaster). From this, it was identified that there were 84 variants with a MAF < 0.01 (Supplementary_Material_1) which included only 27 variants with a MAF < 0.001. Investigation of all mitochondrial variants using MitoMap in silico tools failed to find sufficient evidence that any would be causative of CSVD. Despite this, the HmtDB pathogenicity prediction tool identified 9 homoplasmic variants as pathogenic/likely pathogenic (Table 1). Whilst it is unlikely these variants were causative of CSVD, additional filtering strategies were conducted. This excluded two variants (MT-ATP6 NC_012920.1:m.9055G > A and MT-CYB NC_012920.1:m.15218A > G) which had a MAF > 0.01 and a further four variants (m.4295A > G, m.4640C > A, m.8616G > T, and m.12634A > G) due to a MAF > 0.001. Three variants (MT-ATP6 NC_012920.1:m.8605C > T, MT-ND5 NC_012920.1:m.13468C > A, and MT-ND5 NC_012920.1:m.13651A > T) were predicted as pathogenic through HmtDB and also had a MAF < 0.001. Investigation into heteroplasmy failed to identify any change predicted to have clinical significance.
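The two-step frequency exclusion applied to the HmtDB-flagged variants amounts to sorting each variant into one of three tiers. The sketch below illustrates that logic only. It assumes a frequency of exactly 0.001 is not treated as rare (the text retains MAF < 0.001), and it uses the only two frequencies quoted in this paper (0.014 for m.15218A > G and 3.544 × 10⁻⁵ for m.13468C > A) as examples.

```python
COMMON_MAF = 0.01  # first exclusion step: MAF > 0.01
RARE_MAF = 0.001   # retention requires MAF < 0.001


def maf_tier(maf: float) -> str:
    """Classify a variant by minor allele frequency into the three tiers used."""
    if maf < 0 or maf > 1:
        raise ValueError("MAF must lie between 0 and 1")
    if maf > COMMON_MAF:
        return "excluded_common"
    if maf >= RARE_MAF:  # assumption: the boundary value is not counted as rare
        return "excluded_low_frequency"
    return "retained_rare"


# Frequencies quoted in this paper for two of the nine variants.
examples = {
    "m.15218A>G": 0.014,
    "m.13468C>A": 3.544e-5,
}
tiers = {variant: maf_tier(maf) for variant, maf in examples.items()}
print(tiers)
```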

Table 1 MitoSeq variants with a predicted pathogenicity of likely-pathogenic to pathogenic based on the HmtDB pathogenicity tool

WES NEMP Investigations—MELAS-Related Conditions

The assessment of NEMP gene variants identified n = 82 variants which passed QC parameters and had a benign/tolerated rating in < 2 of the in silico pathogenicity tools used as well as a MAF < 0.001. All variants were heterozygous and amino acid altering. Filtering based on Boolean searches and mRNA and protein expression data obtained from the Protein Atlas (where applicable) identified n = 29 gene variants across the 50 samples detected in the NEMP genes potentially associated with CSVD related pathology or causative of conditions which clinically overlap with CADASIL/CSVD symptoms.

Investigations based on these Boolean searches identified three genes which have been linked to MELAS pathology. The first was a heterozygous missense variant in POLG, ENST00000268124.5 c.2209G > C p.Gly737Arg (rs121918054). This variant is classified as Likely Pathogenic to Pathogenic in ClinVar (VCV000013513) and is associated with multiple mtDNA disorders. However, it must be noted that the majority of these disorders are recessive conditions and that this variant was most likely only classified in this way in the presence of a second mutation. The second was a heterozygous variant in FASTKD2, a gene which has been associated with MELAS-like disease: ENST00000236980.6 c.965C > T p.Pro322Leu (rs748507111). However, the MELAS phenotype associated with this gene has an autosomal recessive inheritance pattern. The third gene associated with MELAS was MTO1, in which there were n = 2 separate heterozygous variants: ENST00000370300.4 c.350G > A p.Arg117His (rs763571382) and ENST00000370300.4 c.1505G > A p.Arg502His (rs201544686).

WES NEMP Investigations—Encephalopathies

There were n = 14 heterozygous variants in genes which have previously been identified to cause encephalopathy; however, all but two were in genes where encephalopathy is associated with an autosomal recessive inheritance and/or childhood/neonatal presentation (Supplementary 2). The remaining two heterozygous changes were both identified in LONP1 (NM_001276480.1 c.2218G > A p.Val740Met and c.1288G > A p.Val430Met) (Table 2) which has previously been identified to cause an autosomal-dominant mitochondrial encephalopathy. While these particular variants have not been previously reported as disease causing, they have previously been identified to contribute to the CSVD-related phenotype in some patients [20].

Table 2 Candidate variants identified from NEMP genes which have previously been identified to either cause MELAS or have protein products altered in MELAS patients. This table also includes the most likely candidate variants that are also associated with encephalopathy (LONP1). In silico pathogenicity scores were obtained using VEP annotation and the PredictSNP2 tool and mRNA and protein expression information for smooth muscle and neurological tissues was obtained from GTEx

WES NEMP Investigations—Alzheimer’s Disease and Stroke

There were 7 heterozygous variants in genes which have been implicated in Alzheimer’s disease (AD) pathology, either through functional protein studies or via genetic association studies (Table 3). These included NDUFAF6, NDUFB3, TCIRG1, and BCKDK, which were all found to be associated with AD through either GWAS or meta-analysis studies [21,22,23,24,25]. Additionally, HAGH protein levels were reported to be significantly higher in the blood of AD patients with the characteristic APOE ε4 variant, KIF1B expression was increased in the brains of AD patients, and blood methylation levels within the COASY gene were increased [26, 27]. Two further heterozygous variants were identified in genes which have been linked to ischaemic stroke: MFN1, through nitric oxide-induced fission related to early ischaemic stroke events, and PDK1, which is thought to play a role in Ca2+-derived activation of platelets contributing to platelet aggregation and ischaemic stroke risk.

Table 3 Genetic variants in the CADASIL-related CSVD cohort in genes from which studies have previously identified a link between the gene loci/function and Alzheimer’s disease. In silico pathogenicity scores were obtained using VEP annotation and the PredictSNP2 tool and mRNA and protein expression information for smooth muscle and neurological tissues was obtained from GTEx

Discussion

Mitochondrial Sequencing Investigations

Initial investigations of known disease-causing variants in the MitoSeq data failed to identify any changes that would result in a known CSVD or neurodegeneration condition such as MELAS. Further investigations focussing on the predicted pathogenic and likely pathogenic variants found 9 variants. The m.4295A > G (MT-TI) variant, seen in DGR349, has previously been found to be associated with maternal hypertension and maternal sensory hearing loss [28]. Another case study also found this variant in an individual who suffered an occipital stroke [29]. Despite these reports, the MAF of this variant is > 0.001 across multiple population databases, making it unlikely that it is causative of CSVD, and there is little evidence to suggest a role for this variant in CSVD or neurodegeneration phenotypes.

In DGR327, there was the m.8605C > T variant which affects MT-ND2. This gene is one of the seven mitochondrial genes encoding subunits of NADH dehydrogenase, and variants in this gene have previously been associated with Leber optic atrophy, mitochondrial complex I deficiency, and Leigh syndrome due to mitochondrial complex deficiency [30, 31]. With an allele frequency of 0.035, it is unlikely that this variant would be causative of a rare monogenic form of CSVD. Furthermore, there is an additional variant at this locus, m.8605C > A, which may indicate that this position is more prone to variation than current population databases reveal.

Three variants were identified across seven samples in the MT-ATP6 gene (DGR366 m.8605C > T, DGR353 m.8616G > T, and DGR020, DGR339, DGR340, DGR344, and DGR349 m.9055G > A) that were considered likely pathogenic or pathogenic. The m.8616G > T and m.9055G > A variants both had a MAF > 0.001 and were thus excluded as likely candidate causes of CSVD. While the m.8605C > T variant had a MAF < 0.001, this position is also known to harbour a different variant, m.8605C > A, which may indicate that it is more prone to variation than current population databases reveal. For this reason, it is unlikely that it could be considered a pathogenic cause of CSVD. Variants in MT-ATP6 have been shown to cause a number of symptoms such as ataxia, cognitive dysfunction, neuropathy, seizures, and retinopathy; however, they are most commonly known to cause mitochondrial complex V deficiency, Leigh syndrome, and/or neuropathy, ataxia and retinitis pigmentosa (NARP) [32]. These conditions primarily affect infants and young children, so it is unlikely that homoplasmic variants within this gene are a cause of CSVD pathology.

There were three homoplasmic variants identified as likely pathogenic or pathogenic in MT-ND5 across four samples (DGR324 and DGR337 m.12634A > G, DGR338 m.13468C > A, and DGR037 m.13651A > T). Heteroplasmic variants in MT-ND5 have previously been identified in patients with MELAS or Leigh syndrome, and it has been proposed that this gene may be a novel hot spot for MELAS-causing variants outside of the classical m.3243A > G variant [33,34,35,36]. However, none of the variants described in this literature match those seen in the CADASIL-related CSVD cohort. Due to the homoplasmic nature of the variants identified, it is likely that any clinical manifestations would not be limited to neurological and neurovascular phenotypes. Despite this, the m.13468C > A and m.13651A > T variants have not previously been identified in the Mitomap database. The m.13468C > A variant has a total incidence of 3.544 × 10⁻⁵ in gnomAD, while the m.13651A > T variant has not previously been detected in gnomAD.

The final variant of interest was the homoplasmic MT-CYB m.15218A > G, which was identified in DGR069 and DGR070. Variants in this gene are primarily associated with Leigh syndrome and exercise intolerance; however, an infantile form of MELAS has also been caused by a 4-bp deletion affecting MT-CYB [37, 38]. As these conditions generally affect young children and infants, and there is a lack of evidence to support homoplasmic changes in this gene as a cause of CSVD or a MELAS-like condition, it is unlikely that this variant is causative of disease. Furthermore, the m.15218A > G variant has an allele frequency of 0.014, which is too high in the general population for it to be causative of a rare condition. Overall, there is little evidence at this stage to show that the mitochondrial variants on their own are causative of disease; however, there may be some minor contribution that would require larger scale studies to identify.

NEMP Genes Associated with MELAS or MELAS-Like Phenotypes

POLG encodes the catalytic subunit of the DNA polymerase-γ enzyme, which is solely responsible for mtDNA replication and confers the proofreading activity of the enzyme [39]. Variants in this gene have been found to be associated with a wide range of phenotypes inherited in an autosomal dominant or recessive manner [40, 41]. This is also true for the variant identified in DGR349 (p.Gly737Arg), which has previously been characterised in ClinVar (VCV000013513.13) and was associated with progressive sclerosing poliodystrophy, mitochondrial DNA depletion syndrome 4B (MNGIE type) (MIM# 613662), sensory ataxic neuropathy-dysarthria-ophthalmoparesis syndrome (MIM# 607459), POLG-related spectrum disorders, seizures, and progressive external ophthalmoplegia with mitochondrial DNA deletions (autosomal recessive). Reduced enzyme activity of POLG may result in mtDNA depletion and/or multiple deletions, given the role POLG plays in mtDNA replication and related processes (e.g. proofreading). POLG variants have also previously been associated with MELAS and MELAS-like phenotypes. In addition, although a MAF of 7.5 × 10⁻⁴ with n = 211 allele carriers in the gnomAD database is within the threshold for a rare disorder, it is still considered somewhat high. Given the high clinical heterogeneity associated with POLG variants, and as no family studies were completed for DGR343, it is still unclear whether this variant is causative of or contributing to the CSVD phenotype. Furthermore, whilst the POLG glycine 737 codon is highly conserved and the variant is thus computationally predicted as pathogenic through SIFT and PolyPhen2, it is most commonly associated with POLG-related diseases in the presence of a second mutation in the gene. Further functional testing and further examination of the exome for DGR343 are recommended.

There was also a heterozygous variant identified in FASTKD2. This variant has previously been identified as likely pathogenic in ClinVar (VCV000214352.2) as causative of combined oxidative phosphorylation deficiency type 44 (MIM# 612322). However, variants in FASTKD2 have also been associated with an autosomal-recessive inherited form of MELAS [42]. Despite this association, it is unlikely that the heterozygous FASTKD2 variant identified in DGR345 is causative of a MELAS phenotype; however, further study into the effect this variant has on FASTKD2 function may identify a plausible role for this gene and its variants in CSVD.

Finally, two individuals (DGR023 and DGR025) were identified with separate variants in MTO1. The first variant, identified in DGR023, results in a p.Arg502His change (rs201544686) that has previously been cited in ClinVar as causing an autosomal recessive mitochondrial disease (VCV000089037.3). The second variant is a heterozygous missense change, p.Arg117His (rs763571382), which has not previously been classified in ClinVar. Variants in MTO1 are recognised in OMIM as causing combined oxidative phosphorylation deficiency 10 (MIM# 614702). Notably, MTO1 encodes a protein involved in tRNA modification and protein synthesis. In MELAS, the two variants estimated to cause ~ 90% of cases, m.3243A > G (80% of cases) and m.3271T > C (~ 10% of cases), both affect the mitochondrial tRNA leucine 1 gene (MT-TL1) [43, 44]. Furthermore, MELAS patients with the m.3243A > G variant have been found to have decreased expression of MTO1 due to the induction of miR-9/9* [45, 46]. The proposed mechanism in MELAS that causes this induction of miR-9/9* and subsequent downregulation of MTO1 relates to oxidative stress and increased levels of intracellular Ca2+ and reactive oxygen species (ROS). It would be interesting to investigate the cellular effects caused by the variants in DGR023 and DGR025 to see whether they correlate with decreased expression or activity of MTO1, resulting in a phenotype mimicking MELAS or related to CSVD.

NEMP Genes Associated with Encephalopathy

There were n = 15 variants identified across 12 genes previously associated with encephalopathy. This was the most prevalent neurological symptom identified through the Boolean searches; however, further investigation of the literature found that 11 of the 12 genes were only linked to encephalopathy through biallelic loss or compound heterozygous variants, which caused severe conditions often detected in infancy. Despite this, it is an interesting link, as encephalopathy is a common symptom of CSVD and its aetiology may be related to mitochondrial dysfunction.

The remaining variants were both identified in LONP1 (DGR027 p.Val430Met and DGR353 p.Val740Met). Interestingly, a recent manuscript described a different, de novo dominant variant in LONP1 that caused a mitochondrial encephalopathy with neonatal seizures and death in a 1-year-old [20]. Typically, biallelic variants in LONP1 cause cerebral, ocular, dental, auricular, and skeletal anomalies (CODAS) syndrome (MIM# 600373), a neonatal multisystem disorder where variants are located in the ATP-dependent protease LON-binding domain. These variants are thought to dramatically increase the proteolytic activity of LONP1 [47]. The proteolytic activity of LONP1 is thought to play a role in modulating the abundances of proteins involved in processes relating to environmental cues [20, 48] (Zurita Rendón & Shoubridge, 2018). The variants identified in this CSVD cohort affect a different region of the gene and are outside of the key regulatory domains for this protein. Although located outside the functional domains of the protein, these variants may still contribute to the pathogenicity of CSVDs, particularly as these disorders typically manifest later in life. Furthermore, it could be theorised that heterozygous variants in LONP1 may result in a cumulative effect that causes a later-in-life encephalopathy/cognitive decline [20].

NEMP Genes Associated with Alzheimer’s Disease Pathology

There were 8 variants identified in genes that were either associated with AD through genetic studies or whose products have been shown to behave differently in AD patients. The first gene variant which had an association with AD was the KIF1B p.Arg942Cys (rs542546734) variant. KIF1B was identified as upregulated in post-mortem brains of individuals with AD, from which it was theorised that it may play a role not yet defined in AD [26]. Despite the reported expression change, there is still limited evidence to suggest a role of KIF1B in AD pathology, and as such it is unclear whether the variant detected in DGR021 is causative of CSVD. Furthermore, variants in KIF1B are documented as causing autosomal dominant inherited disorders including Charcot-Marie-Tooth disease type 2A1 (MIM# 118210) and pheochromocytoma (MIM# 171300). Given these well-established autosomal dominant links to other diseases, it is unlikely that KIF1B would be a novel candidate gene for CADASIL-related CSVD pathology.

Variants were also identified in DGR026 (OGDH p.Arg81Cys) and DGR337 (HAGH p.Thr111Ile), in genes which have both been reported in studies showing gene expression changes in AD patients [49, 50]. OGDH has been identified to be overexpressed and linked with a protective effect against externally added ROS [49]. Similarly, a GWAS identified elevated levels of HAGH protein in APOE ε4 carrier patients [50]. It is unlikely that the variant detected in DGR337 also causes increased plasma levels of HAGH, as this individual does not carry the APOE ε4/ε4 genotype. As the change in HAGH expression was seen in APOE ε4 carrier patients compared to controls, we also cannot rule out that this was a result of an epistatic mechanism, and as such a variant in HAGH by itself would not be sufficient to contribute to pathology.

Whilst changes in protein and gene expression have been identified in KIF1B, HAGH, and OGDH, association studies have also identified AD loci in NDUFB3, NDUFAF6, TCIRG1, and BCKDK [21, 23,24,25]. There is increasing evidence to support the association of these NEMP genes and AD and it has led to theories of the pathological mechanisms that link AD pathology with disrupted mitochondrial structure and function [6].

DGR344 was found to have a heterozygous NDUFB3 p.Trp22Arg variant. Interestingly, this variant has previously been found to cause mitochondrial complex I deficiency (MIM# 618246) when it is inherited in a homozygous fashion [51, 52]. NDUFB3 encodes an accessory subunit of oxidative phosphorylation (OXPHOS) complex I, which is the first enzyme in the mitochondrial electron transport chain. Reduced expression of OXPHOS protein subunits has been identified and replicated in the blood of AD patients exhibiting mild cognitive impairment, which has led to a theory that an imbalance in nuclear and mitochondrial-encoded OXPHOS transcripts may drive a negative feedback loop that reduces OXPHOS efficiency and increases neuronal damage by ROS [53, 54]. As OXPHOS efficiency is impaired in AD patients, further investigation of heterozygous carriers of this NDUFB3 variant needs to be completed before NDUFB3 could be considered a candidate gene causative of an AD/CSVD phenotype.

DGR350 had a heterozygous variant in NDUFAF6 (p.Ala92Val), a gene which was recently identified as a novel locus associated with AD in a meta-analysis focussing on AD and vascular dementia patients [21]. The locus (NDUFAF6-rs10098778) identified by Moreno-Grau et al. (2019) is also in high linkage disequilibrium with another marker in the same gene, NDUFAF6-rs4735340, which was the top signal in a separate meta-analysis focussed on identifying novel risk loci in late-onset Alzheimer’s disease (LOAD) [55]. The NDUFAF6 p.Ala92Val variant detected in DGR350 has not been classified in ClinVar despite being previously catalogued in dbSNP, and to date no functional studies have been completed for this position in the gene. However, studies have found that biallelic variants in this gene cause autosomal recessive Leigh syndrome and are associated with mitochondrial complex I deficiency [22]. While Leigh syndrome is a progressive neurodegenerative disease, typically presenting in the first year of life, there is currently no link suggested between this condition and CSVD or AD pathology [56]. Based on this, it is unclear whether variants in NDUFAF6 may be causing the CSVD phenotype in DGR350, and further studies into heterozygous changes in this gene may be necessary to identify any late-onset conditions associated with it.

There was a heterozygous TCIRG1 p.Met403Ile variant identified in DGR344. Interestingly, TCIRG1 has a separate missense variant (TCIRG1 11:67,810,477:C > T) that was identified to segregate in 3 separate families with early onset Alzheimer’s disease (EOAD) [24]. TCIRG1 is a component of the a3 subunit of the H+-ATPase complex and is located within the lysosomes, where it is critical for the acidification of vacuoles which removes debris via the endolysosomal pathway. Interestingly, lysosomal function has also been theorised to aid in the disposal of the NOTCH3 extracellular domain (NECD) in wild-type and CADASIL patients [57]. In CADASIL, this may affect disease pathogenesis with a recent study suggesting that a dysfunction in the autophagy-lysosomal pathway may be a key component to the molecular mechanisms of CADASIL [58]. The TCIRG1 amino acid change in DGR344 (p.Met403Val) is two amino acids upstream from a vacuolar domain in the protein, and it may be that this variant disrupts the helical domain and alters TCIRG1 protein function.

The final variant identified was a heterozygous COASY p.Tyr396Cys (rs375845459) variant. This gene was previously identified as associated with AD through increased blood methylation profiles and reported as a potential biomarker for early diagnosis [27]. COASY encodes coenzyme A (CoA) synthase, which performs a key role in the biosynthesis of CoA from pantothenic acid [59,60,61]. It is primarily present in the mitochondrial matrix, and variants within the gene have been found to alter enzyme activity [27]. Homozygous variants within COASY have been associated with neurodegeneration due to iron accumulation, and a SNP within exon 4 (rs598126) has been identified as a risk factor for EOAD in females with Down syndrome [62, 63]. The variant in DGR343 is located within the PFAM dephospho-CoA kinase domain and may impact the functional activity of the enzyme. COASY has been reported to code for three transcript variants which result in tissue-specific isoforms (the alpha enzyme is ubiquitously expressed and the beta transcript is primarily expressed in the brain in humans), which makes it difficult to elucidate the functional consequence of variants within the gene, particularly in neurodegenerative disorders [64].

Conclusion

Overall, this study identified 29 genetic variants in NEMP genes which have previously been implicated as playing a role in CSVD-related diseases or phenotypes. Mitochondrial genome variant screening did not identify any known cause of disease in this cohort, including the m.3243A > G (tRNA Leu) variant causative of MELAS. Extraction of the mitochondrial genome from the Ion AmpliSeq WES data did not yield sufficient coverage depth across the mitochondrial genome; however, MitoSeq was able to mitigate this. MitoSeq generated a high sequencing depth and allowed for high-quality investigation of the mitochondrial variants detected, despite no causative variants being identified. Extraction of the NEMP genes from WES data identified variants which may cause dysfunction in mitochondria and that are in genes previously associated with CSVD and related neurodegenerative disorders. Most of these genes are associated with severe autosomal recessive conditions, and it is unlikely that many are causative of CSVD in this cohort, though there is evidence that variants identified in POLG, MTO1, LONP1, NDUFAF6, NDUFB3, and TCIRG1 may be novel causes of CSVD or contribute to a neurodegenerative phenotype. The impact of these genes in CSVD and neurodegeneration should be further investigated.