Genotype–phenotype characterisation of long survivors with motor neuron disease in Scotland

Background We investigated the phenotypes and genotypes of a cohort of ‘long-surviving’ individuals with motor neuron disease (MND) to identify potential targets for prognostication.
Methods Patients were recruited via the Clinical Audit Research and Evaluation for MND (CARE-MND) platform, which hosts the Scottish MND Register. Long survival was defined as > 8 years from diagnosis. 11 phenotypic variables were analysed. Whole genome sequencing (WGS) was performed and variants within 49 MND-associated genes examined. Each individual was screened for C9orf72 repeat expansions. Data from ancestry-matched Scottish populations (the Lothian Birth Cohorts) were used as controls.
Results 58 long survivors were identified. Median survival from diagnosis was 15.5 years. Long survivors were significantly younger at onset and diagnosis than incident patients and had a significantly longer diagnostic delay. 42% had the MND subtype of primary lateral sclerosis (PLS). WGS was performed in 46 individuals: 14 (30.4%) had a potentially pathogenic variant. 4 carried the known SOD1 p.(Ile114Thr) variant. Significant variants in FIG4, hnRNPA2B1, SETX, SQSTM1, TAF15, and VAPB were detected. 2 individuals had a variant in the SPAST gene suggesting phenotypic overlap with hereditary spastic paraplegia (HSP). No long survivors had pathogenic C9orf72 repeat expansions.
Conclusions Long survivors are characterised by younger age at onset, increased prevalence of PLS and longer diagnostic delay. Genetic analysis in this cohort has improved our understanding of the phenotypes associated with the SOD1 variant p.(Ile114Thr). Our findings confirm that pathogenic expansion of C9orf72 is likely a poor prognostic marker. Genetic screening using targeted MND and/or HSP panels should be considered in those with long survival, or early-onset slowly progressive disease, to improve diagnostic accuracy and aid prognostication.
Supplementary Information The online version contains supplementary material available at 10.1007/s00415-022-11505-0.


Introduction
The Scottish Motor Neuron Disease Register (SMNDR), relaunched as the Clinical Audit Research and Evaluation for MND (CARE-MND) platform in 2015, has been collecting data regarding people living with MND (amyotrophic lateral sclerosis (ALS) and the MND subtypes primary lateral sclerosis (PLS), progressive muscular atrophy (PMA) and progressive bulbar palsy (PBP)) in Scotland since 1989 [1,2]. The unique longevity of the register and the united efforts of the CARE-MND Consortium have provided extensive insight into the phenotypic and genetic heterogeneity of the disease [1,3,4]. Recent analysis of a historical cohort of 428 Scottish people with MND indicated a median survival of 3.5 years from onset of symptoms and 2 years from diagnosis [5]. However, the upper range of survival was 25.8 years from diagnosis (Fig. 1).
There is currently no agreed definition of long survival with MND, with thresholds in the literature ranging from 5 to 10 years [6-9]. Previous European estimates of the proportion of patients surviving more than 10 years from diagnosis are 11.8% [8]. The phenotype is variable: some long-surviving people with MND have typical features of ALS [6,7], while others have lower frequency of bulbar symptoms [2,10] or specific disease subtypes such as PLS [11]. Individuals with PLS are thought to have younger onset disease [12]. Prediction of long survival is otherwise challenging; this prognostic uncertainty, compounded by a typically protracted time to reach a diagnosis, is psychologically difficult for people with MND and their families [12].
Some SOD1 variants and a specific variant in the UNC13A gene (rs10419420) are thought to impart long survival [13,14]. People with apparent PLS have been found to have genetic variants not typically associated with MND, such as those normally associated with hereditary spastic paraplegia or Parkinson's disease [11,12,15]. Otherwise, 'long survivors' have not been extensively genotypically characterised.
Harnessing 30 years of data from the Scottish MND Register/CARE-MND platform, we aimed to study long survivors phenotypically and genetically. Whilst the overall proportion of long survivors in MND populations is low, improved phenotypic understanding of this group is important to improve early and accurate diagnosis. Characterisation of long survivors might help to counsel patients and prognosticate at diagnosis. Long survivors also comprise a significant proportion of the prevalent clinical and 'research-ready' population and so knowledge of disease features may minimise bias and optimise generalisability of clinical trials [8]. DNA sequencing may provide clues towards genotype-phenotype associations, protective genetic factors or phenotypic overlap with disorders such as hereditary spastic paraplegia (HSP) [16]. We therefore aimed to identify phenotypic markers or genetic variants that might distinguish long survivors from a typical incident population of people with MND in Scotland, to guide clinical management and provide recommendations for genetic testing.

Patient details
Long survivors were defined as people with MND with survival from diagnosis beyond the 80th percentile of the historical 1989-2014 Scottish genotype study cohort (> 8 years) [5]. This cut-off was determined as a conservative measure of long survival based on previous literature [6-8]. Survival was calculated from date of symptom onset or diagnosis until death or censorship date (3rd August 2021). The following phenotypic data fields were available for analysis: Sex, Age of Onset, Age of Diagnosis, Time to Diagnosis, Survival from Onset, Site of Onset, El Escorial Classification, Family History of MND, Feeding Tube Insertion, Non-invasive Ventilation (NIV) Use, Riluzole Use. Rate of change of the ALS-Functional Rating Scale (ALS-FRS) was not thought to be informative as it is considered inadequate for upper motor neuron predominant MND such as PLS [18]. Other predictors, such as cognitive assessments, were only available for a small number of long-surviving patients and so could not be included in statistical analyses.
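The long-survivor definition above can be illustrated with a minimal Python sketch. It is not the authors' code (the analysis was performed in R); the function names and the 365.25-day year are assumptions made here for illustration, whereas the 8-year threshold and the censorship date are taken from the text.

```python
from datetime import date
from typing import Optional

CENSOR_DATE = date(2021, 8, 3)   # censorship date stated in the text
LONG_SURVIVAL_YEARS = 8.0        # threshold stated in the text (> 8 years from diagnosis)
DAYS_PER_YEAR = 365.25           # assumption: the day-to-year conversion is not stated


def survival_years(start: date, death: Optional[date]) -> float:
    """Years from start (onset or diagnosis) to death, or to censoring if still alive."""
    # Deaths after the censorship date are treated as censored observations.
    end = death if death is not None and death <= CENSOR_DATE else CENSOR_DATE
    return (end - start).days / DAYS_PER_YEAR


def is_long_survivor(diagnosis: date, death: Optional[date]) -> bool:
    """True if survival from diagnosis exceeds the 8-year threshold."""
    return survival_years(diagnosis, death) > LONG_SURVIVAL_YEARS


# Hypothetical individuals, for illustration only.
print(is_long_survivor(date(2010, 1, 1), None))              # alive at censoring
print(is_long_survivor(date(2015, 1, 1), date(2018, 1, 1)))  # died 3 years after diagnosis
```

Note that an individual diagnosed fewer than 8 years before the censorship date cannot be classified as a long survivor under this rule, regardless of vital status.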

Genetic analysis
Patient DNA samples were analysed using whole genome sequencing technology, performed as part of the Scottish Genomes Partnership (SGP) study [19]. Samples were sequenced to 30X coverage using TruSeq Nano library preparation kits and a HiSeq X sequencing platform (Illumina). FASTQ files were aligned to the human genome build GRCh37 using bwa mem (0.7.13) [20]. Post-processing was performed with samblaster (0.1.22) [21] to mark duplicate reads, and the Genome Analysis ToolKit (GATK, v3.4-0-g7e26428) [22] for indel realignment and base recalibration. Genotype likelihoods for each sample were calculated using the GATK HaplotypeCaller and resulting GVCF files were called jointly using GATK's GenotypeGVCFs function. Variant quality score recalibration (VQSR) was performed as per GATK best-practices [23] and a truth sensitivity threshold of 99.9% applied. Variants were filtered to include only those present in 49 MND-associated/MND-mimic genes (Fig. 2). Filtered variants were annotated and population frequency filters were applied using VarSeq software (Golden Helix) [24] to include only those with minor allele frequency (MAF) ≤ 0.01 in gnomAD 2.0.1v3 [25]. Variants were annotated using multiple in silico prediction algorithms from the Database for Nonsynonymous SNPs and their Functional Predictions (dbNSFP) [26], which included: SIFT, PolyPhen2 HDIV and HVAR, Mutation Taster, Mutation Assessor, FATHMM, PROVEAN, GERP and PhastCons. Measures of impact on splice site included scores derived from adaptive boost (Ada) and random forest (RF) models [27]. Samples were also tested for C9orf72 hexanucleotide repeat expansions using repeat-primed PCR methodology; ≥ 30 repeats was considered pathogenic [28].
Variants were classified using the American College of Medical Genetics and Association for Molecular Pathology (ACMG-AMP) framework, adhering to the Association for Clinical Genomic Science (ACGS) UK 2020 guidelines [29,30]. Co-segregation was determined using methods described by Jarvik et al. [31]. DNA samples from the Lothian Birth Cohorts (n = 1385) were used as ancestry-matched controls to identify variants enriched in cases versus controls as per ACMG-AMP guidelines. Variants were considered significantly more prevalent in cases versus controls if the odds ratio was > 5.0 and the confidence interval did not cross 1.0 (ACMG-AMP criterion PS4) [29]. Additionally, the gnomAD database was used as a population control data set (ACMG-AMP criterion PM2) [25]. Variants meeting criteria thresholds for a pathogenic or likely pathogenic classification were reported. Variants of uncertain clinical significance (VUS) that fulfilled some criteria for being pathogenic, without reaching strict ACMG-AMP thresholds for significance, were considered under the Bayes rules outlined by Tavtigian et al. to determine a posterior probability of their being potentially pathogenic (probability > 0.5) [32].
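The gene-panel and frequency filters described above were applied in VarSeq; the following Python sketch only illustrates the logic under stated assumptions. The Variant record, the function names, the three-gene panel subset and the variant labels are hypothetical. Treating a variant that is absent from gnomAD as rare is also an assumption. The MAF threshold (≤ 0.01) and the C9orf72 cut-off (≥ 30 repeats) are taken from the text.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set

MAF_THRESHOLD = 0.01              # from the text: MAF <= 0.01 in gnomAD
C9ORF72_PATHOGENIC_REPEATS = 30   # from the text: >= 30 repeats considered pathogenic


@dataclass
class Variant:
    gene: str
    label: str                    # hypothetical identifier, not a real HGVS name
    gnomad_maf: Optional[float]   # None if the variant is absent from gnomAD


def filter_variants(variants: Iterable[Variant], panel: Set[str]) -> List[Variant]:
    """Keep variants that lie in a panel gene and are rare in the population."""
    kept = []
    for v in variants:
        if v.gene not in panel:
            continue
        # Assumption: absence from gnomAD is treated as a frequency of zero.
        maf = 0.0 if v.gnomad_maf is None else v.gnomad_maf
        if maf <= MAF_THRESHOLD:
            kept.append(v)
    return kept


def c9orf72_is_pathogenic(repeat_count: int) -> bool:
    """Apply the repeat-length cut-off stated in the text."""
    return repeat_count >= C9ORF72_PATHOGENIC_REPEATS


# Illustrative subset of the 49-gene panel and made-up candidate variants.
panel = {"SOD1", "TAF15", "SPAST"}
candidates = [
    Variant("SOD1", "variant_A", None),      # absent from gnomAD: kept
    Variant("SPAST", "variant_B", 0.0004),   # rare: kept
    Variant("TAF15", "variant_C", 0.05),     # too common: dropped
    Variant("BRCA1", "variant_D", 0.0001),   # not in the panel: dropped
]
kept = filter_variants(candidates, panel)
print([v.label for v in kept])
```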
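The Bayesian treatment of VUS can be sketched as follows. This is a hedged illustration, not the authors' implementation. The prior probability (0.10), the odds of pathogenicity for very strong evidence (350) and the exponent weights are assumptions based on the commonly cited form of the Tavtigian et al. framework and should be checked against [32]. Only the > 0.5 posterior threshold comes from the text, and the function names are hypothetical.

```python
PRIOR_P = 0.10             # assumed prior probability of pathogenicity
ODDS_VERY_STRONG = 350.0   # assumed odds of pathogenicity for very strong evidence
VUS_INTEREST_THRESHOLD = 0.5  # from the text: posterior probability > 0.5


def odds_of_pathogenicity(supporting: int = 0, moderate: int = 0, strong: int = 0,
                          very_strong: int = 0, benign_supporting: int = 0,
                          benign_strong: int = 0) -> float:
    """Combine counts of ACMG-AMP evidence criteria into a single odds value."""
    exponent = (supporting / 8 + moderate / 4 + strong / 2 + very_strong
                - benign_supporting / 8 - benign_strong / 2)
    return ODDS_VERY_STRONG ** exponent


def posterior_probability(odds_path: float, prior: float = PRIOR_P) -> float:
    """Convert combined odds and a prior into a posterior probability."""
    return odds_path * prior / ((odds_path - 1.0) * prior + 1.0)


def is_vus_of_interest(posterior: float) -> bool:
    return posterior > VUS_INTEREST_THRESHOLD


# Example: two moderate criteria alone do not reach 'likely pathogenic'
# under ACMG-AMP rules but give a posterior above 0.5 in this sketch.
p = posterior_probability(odds_of_pathogenicity(moderate=2))
print(round(p, 3), is_vus_of_interest(p))
```

With no evidence the posterior equals the prior, so a variant only crosses the 0.5 threshold once some pathogenic criteria are met.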

Statistical analyses
Long survivors were compared with the incident population cohort of people with MND in Scotland diagnosed in 2015-17 (n = 437) using univariate statistics (Fisher's exact tests, t-tests and Wilcoxon rank-sum tests). Correction for multiple testing was undertaken using the Bonferroni method. R statistical programming was used [33].
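As a brief illustration of the Bonferroni correction, the Python sketch below divides a family-wise alpha by the number of tests. The alpha of 0.05 is an assumption (it is not stated in the text), although dividing it by the 11 phenotypic variables listed under Patient details gives a value consistent with the threshold of 0.0045 reported in the results. The function names are hypothetical.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under the Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def is_significant(p_value: float, threshold: float) -> bool:
    return p_value < threshold


# Assumption: family-wise alpha of 0.05 across the 11 phenotypic variables.
threshold = bonferroni_threshold(0.05, 11)
print(round(threshold, 4))
```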

Phenotypes
Fifty-eight long survivors were identified, representing 3.3% of the total number of individuals on the CARE-MND platform at the time of data acquisition (n = 1779). Median survival from onset was 18.3 years (IQR 14.6-22.4) (missing dates for four individuals); from diagnosis it was 15.5 years (IQR 12.0-19.0). Thirty-seven (63.8%) people were alive at censorship date, giving a point prevalence of 8.9% (Scottish prevalence of MND = 415). Fifty-four (90%) had consented to research and were characterised phenotypically. Percentage completed data per variable ranged from 94.4 to 100% (Table 1). Male-to-female ratio was 2:1. Twenty-nine (54.7%) long survivors had ALS and 22 (41.5%) had PLS; one individual presented with monomelic amyotrophy (classified as Other) and one with progressive bulbar palsy (PBP). The most common site of onset of disease was lower limbs (55.8%). Six (11.5%) had a family history of MND, although two individuals were related. Of the 13 (25%) on ventilation, 12 received non-invasive and one invasive ventilation.
The Bonferroni-corrected threshold for significance was 0.0045. Long survivors were significantly younger at onset (47.4 years) and diagnosis (51.5 years) compared with incident people with MND (p < 0.0001). Time to diagnosis was significantly more prolonged (26 months versus 12 months; p < 0.0001). Classification of disease was significantly different (p < 0.0001), with long survivors more likely to have PLS than incident patients (41.5% of long survivors). Although long survivors were more likely to have lower limb onset disease, this did not reach statistical significance. Long survivors were more likely to be prescribed riluzole at any point (62.7%; p = 0.0005) but were significantly less likely to undergo gastrostomy insertion (only 11.8%; p = 0.0003) (Table 1).
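The headline proportions in this paragraph follow directly from the stated counts, as the short Python check below shows. The helper name is hypothetical; all counts are taken from the text.

```python
def percentage(numerator: int, denominator: int) -> float:
    """Percentage rounded to one decimal place."""
    return round(100.0 * numerator / denominator, 1)


share_of_register = percentage(58, 1779)  # long survivors among all registered individuals
alive_at_censoring = percentage(37, 58)   # long survivors alive at the censorship date
point_prevalence = percentage(37, 415)    # living long survivors among prevalent MND cases

print(share_of_register, alive_at_censoring, point_prevalence)
```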

Genotype
Forty-six (79.3%) of all long survivors donated a DNA sample. Forty-six unique variants (94 variant calls including variants found in multiple individuals) met filtering criteria (Supplementary Table 1). All variants were detected in a heterozygous state. Three were considered likely benign variants (6.5%). Sixteen variants fulfilled ACMG-AMP/ACGS criteria for being likely pathogenic (34.8%). However, three variants were in genes normally causing disease in an autosomal recessive pattern (ALS2 and SPG11) and three variants were novel frameshift/indel variants in a likely unstable region of a tenuously associated MND gene (TAF15) (Supplementary Table 1). Excluding these variants, 10 were classified as likely pathogenic (21.7%). [...] (Table 2) were observed in the Scottish population previously [5]. Variants of interest were also identified in FIG4, hnRNPA2B1, SETX, SPAST and VAPB genes (Table 2). Two related individuals had both the SOD1 p.(Ile114Thr) mutation plus a frameshift mutation (p.Asp449Thrfs*28) in the TAF15 gene [34]. The TAF15 mutation was found in two unrelated samples and nine controls, giving an odds ratio of 6.9 (95% CI 1.5-33.1). However, other variants were also observed at the same site as this loss-of-function variant, suggesting possible instability of the region (Supplementary Table 1). Mutations in TAF15 are postulated to be linked with MND pathogenesis due to their shared role in FET (FUS, EWSR1, TAF15) protein pathways but are not clearly associated with MND and so the significance of this variant is attenuated [34].
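The odds ratio quoted above for the TAF15 frameshift variant can be illustrated with a minimal Python sketch. The text does not state which interval method was used; a Woolf (log-based) 95% interval is assumed here because, with 46 genotyped cases and 1385 controls as denominators (also an assumption), it gives values consistent with the reported 6.9 (1.5-33.1). The function names are hypothetical.

```python
import math
from typing import Tuple


def odds_ratio_ci(case_carriers: int, case_total: int,
                  control_carriers: int, control_total: int,
                  z: float = 1.96) -> Tuple[float, float, float]:
    """Odds ratio with a Woolf (log-based) confidence interval."""
    a = case_carriers
    b = case_total - case_carriers
    c = control_carriers
    d = control_total - control_carriers
    if min(a, b, c, d) == 0:
        raise ValueError("a zero cell needs a continuity correction or an exact method")
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


def meets_ps4(odds_ratio: float, lower: float) -> bool:
    """PS4 rule as stated in the Methods: OR > 5.0 and the interval excludes 1.0."""
    return odds_ratio > 5.0 and lower > 1.0


# Two carriers among 46 genotyped cases versus nine among 1385 controls.
or_, lo, hi = odds_ratio_ci(2, 46, 9, 1385)
print(round(or_, 1), round(lo, 1), round(hi, 1), meets_ps4(or_, lo))
```

The width of the interval reflects the very small carrier counts, which is one reason the significance of this variant is interpreted cautiously.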
No patients in this cohort had expansions in the C9orf72 hexanucleotide sequence within the pathogenic range.

Genotype-phenotype associations
Four individuals had the SOD1 p.(Ile114Thr) Scottish founder variant [3,5] (Table 3), although two of these were first cousins once removed and had another family member with MND, suggesting moderate co-segregation of this variant as per calculations outlined by Jarvik et al. (Table 4) [31]. All four patients had familial lower limb onset ALS with a median age of onset of 31.5 years and all required NIV but not gastrostomy. Case note review did not reveal significant cognitive impairment. Another individual had the SOD1 p.(Gly38Arg) variant; this variant is associated with a mouse model of ALS which recapitulates some of the motor features [35]. However, there is a paucity of knowledge regarding its phenotypic correlates. This individual had a family history of MND: his father developed symptoms of MND in his 30s and died in his 60s, implying a similar course of disease.
Table 2 Significant variants identified in long survivors with motor neuron disease (MND). Classification of pathogenicity based on use of American College of Medical Genetics and Association for Molecular Pathology (ACMG-AMP) and Association for Clinical Genomic Science (ACGS) UK 2020 guidelines. All variants detected in heterozygous state. Genomic controls comprised ancestry-matched individuals from the Lothian Birth Cohorts (n = 1385). Variants were considered significantly more prevalent in cases versus controls if odds ratio > 5.0 and confidence intervals did not cross 1.0 (ACMG-AMP criteria PS4). Additionally, gnomAD was used as a population control data set (ACMG criteria PM2). Co-segregation was quantified using methods described by Jarvik et al. [31].
Two patients had missense mutations in FIG4 and SETX respectively, the latter in a patient with PLS. These genes have been associated rarely with both ALS and PLS [15]. Two patients had SQSTM1 variants; these have been [...] are also associated with multisystem disease [36]. One individual had a variant in hnRNPA2B1; although this variant was in the 5'UTR (untranslated region) of this gene, the region is highly conserved and the variant was absent from controls with a low MAF in population databases. Variants in hnRNPA2B1 are associated with inclusion body myopathy with early-onset Paget disease with or without frontotemporal dementia, but have been described in ALS [37,38]. On case note review, neither the patient with the hnRNPA2B1 variant nor the patients with SQSTM1 variants had evidence of the multisystem disease typical of these genes (inclusion body myopathy or Paget's disease). However, the patient with the hnRNPA2B1 variant had progressive cognitive impairment/dementia but was determined to have a PLS phenotype. The two patients with SQSTM1 mutations had limb onset ALS, requiring both NIV and gastrostomy. One of these patients was also a heterozygous carrier of a loss of function variant in the SPG11 gene (p.M245Vfs*2) (Supplementary Table).
This SPG11 variant has previously been reported in ClinVar [39] in a biallelic state associated with HSP and juvenile ALS but has not previously been observed in trans with another variant. The patient, however, had generalised signs and symptoms in keeping with a classical ALS phenotype, with supportive electromyography (EMG) and gastrostomy and NIV requirements. This patient also underwent post-mortem examination and had typical ALS-associated TDP-43 intracytoplasmic aggregates [Dr Jenna Gregory et al., unpublished]. Two patients carried a SPAST intronic variant; this variant was absent from controls and gnomAD but has been reported in ClinVar in association with autosomal dominantly inherited HSP (SPG4) and had supportive in silico predictors of an effect on splicing [40]. Interestingly, the two patients harbouring this variant had limb onset ALS and PLS and did not require NIV or gastrostomy, suggesting perhaps misdiagnosis of MND or an overlapping phenotypic spectrum. A further splice site mutation was observed in the VAPB gene in an individual with PLS. VAPB variants are rare in MND and the gene is small and tolerant to change. However, all splice algorithms predicted a significant reduction in splice efficiency and the variant was absent in controls with a low MAF.

Discussion
The prevalence of long survivors in the Scottish population (8.9%) is comparable with previous European estimates [8]. Patients in this cohort are younger at onset and diagnosis and have a longer time to diagnosis. We suspect that the diagnostic delay reflects uncertainty based on gradual evolution of clinical features and absence of lower motor neuron features on EMG. These findings agree with previously published observations [7-9, 12, 41]. However, this delay presents a window for improvement in MND care.
Our cohort is enriched for people diagnosed with upper motor neuron predominant MND (PLS). Although not statistically significant compared with incident patients, the male-to-female ratio in the long-surviving group was high (2:1) and this may be due to the contribution of PLS cases which are known to be largely male (ratio 2-4:1) [12]. That PLS and ALS are on a continuum of disease has long been debated; however, recent systematic reviews found no consistent distinguishing imaging or pathological biomarkers [12,18].
Long survivors are more likely to start on riluzole than incident patients. This may be related to opportunity in the context of long duration of disease. While riluzole has been trialled predominantly in people with ALS, an appeal was made to the National Institute for Clinical Excellence (NICE) in 2001 to allow its use in other forms of MND [42]. Clearly, in current practice, patients with PLS do receive riluzole. Long survivors are significantly less likely to have a gastrostomy inserted, in spite of there being more opportunity, temporally, for this to occur. However, the proportions of bulbar-onset patients in incident and long-surviving cohorts are comparable (28.5% and 26.9% respectively). It is possible that upper motor neuron pseudobulbar symptoms (for example, dysarthria) in the long-surviving PLS cohort are being confused for true bulbar symptoms. It may be difficult to localise dysarthria clinically, especially early in disease. However, dysphagia is rare in PLS [43]. Indeed, only two (14.3%) patients with bulbar-onset disease had gastrostomy insertion; other patients were noted to have dysarthria with preserved swallow. Although we might expect respiratory muscle weakness to be associated with poor outcome, a surprisingly high proportion of long survivors required NIV (25.0% in the long survivors, 27.2% in the incident cohort). It was unclear if this requirement occurred late in disease or if NIV contributed to survival. Long-surviving people with MND form a significant proportion of the prevalent population; exclusion of such participants from MND clinical trials risks discrimination and reduces both the population available for recruitment and study power. Our data suggest that detailed characterisation of such participants (by classification and by distinguishing bulbar from pseudobulbar symptoms) would aid trial inclusion, generalisability and interpretation.
Only 11.5% of the long-surviving cohort had a family history of MND whereas almost a third (30.4%) of the genotyped cohort had an ACMG-AMP likely pathogenic variant. As would be expected, this is higher than when a limited six-gene panel was employed for an unselected cohort of MND patients in Scotland (17%) [5]. Genetic analysis in long survivors broadened the already wide phenotypic spectrum of disease of patients with the known pathogenic SOD1 p.(Ile114Thr) variant, which is frequently observed in the Scottish population [5]. Families with this variant in Scotland may be encouraged that it can be associated with long survival. As previously described, people with this variant have an otherwise homogeneous phenotype with limb-onset ALS, preserved bulbar function, low gastrostomy uptake and absence of cognitive impairment.
The presence of other rare variants in MND-associated genes (FIG4, hnRNPA2B1, SETX, SQSTM1, TAF15, VAPB) in patients with both ALS and PLS without a family history confirms the sporadic nature of variants in these genes in the Scottish population. Although these variants met ACMG-AMP/ACGS classification criteria for being likely pathogenic, it is not possible to determine if they are disease-causing in a rare heterogeneous condition such as MND. We raise the possibility that people with MND carrying mutations in the SPAST gene may have an MND syndrome similar to HSP. In general, HSP will normally present earlier than upper motor neuron predominant ALS/PLS and patients are more likely to have lower limb onset disease and a family history; however, there will be exceptions and overlap [12,16]. Extended and explicit past medical history and family history questioning (including inclusion body myopathy, Paget's disease, FTD, HSP and other neurological causes of spasticity) might have illuminated patterns in keeping with the identified genetic variants.
We have also highlighted VUS of potential interest which meet some pathogenic criteria without fulfilling strict ACMG-AMP requirements for pathogenicity; these require future study and reassessment.
The absence of C9orf72 pathogenic expansions in this study is supportive of this variant typically being a poor prognostic marker [44]. Indeed, no long survivors were diagnosed with MND-FTD, compared with 5.7% of the incident cohort. Formal systematic cognitive testing was not undertaken for the majority of long survivors, however, as this assessment tool was only routinely applied within the last decade.

Limitations
Our study would benefit from inclusion of cognitive profiling. Details regarding time to intervention (gastrostomy, NIV) were not available for all patients but future analysis of long survivors might study the temporal relationship of these measures on outcome. Our control population is ancestry-matched but not age- or sex-matched and so may not reflect the characteristics of our typically young MND long survivors.
We have reported variants as per ACMG-AMP guidelines. MND-associated variants are rare and have variable penetrance and so often struggle to fulfil strict ACMG-AMP criteria. Many likely pathogenic variants were so defined due to their absence in controls and absence in gnomAD alone. However, these guidelines provide a necessary structure for variant assessment and allow us to report variants which might as yet fail to achieve clinical significance but might be of research interest. As this was a research project and results were not relayed to patients or families, frameshift and indel variants were not confirmed by Sanger sequencing and so it is not possible to determine if they are truly disruptive. Our 49-gene panel was comprehensive for MND-spectrum genes but we did not examine the androgen receptor CAG trinucleotide repeat associated with Kennedy disease [7], nor did we exhaust all rare HSP and PLS-associated genes [12]. Pathology correlation was only available for one patient in this cohort. Although CARE-MND recruitment methods are stringent [1], it is possible that some patients are misdiagnosed and pathological examination would provide confirmation of disease in rare gene and MND-mimic gene carriers [18].

Implications and conclusions
With the benefit of three decades of longitudinal data collection through the Scottish CARE-MND database, we have shown that long-surviving people with MND can be characterised by younger age at onset and diagnosis, increased incidence of PLS and, crucially, a longer time to diagnosis (median 2.2 years, upper range 15 years). This long period of diagnostic uncertainty for a typically young person is a key target for improvement in care; such patients may otherwise be denied access to designated benefits and specialised support. Early clinical genotyping of such individuals may help to provide reassurance; for example, the absence of a C9orf72 pathogenic expansion and presence of the SOD1 p.(Ile114Thr) mutation or other rare mutation might indicate better outcome. Additionally, presence of the SOD1 p.(Ile114Thr) variant might imply long preservation of bulbar function and cognitive function but allow individuals to prepare for NIV requirement. Extended family history, including multisystem disease and neurological syndromes featuring UMN signs/spasticity, might be of particular merit in those with young-onset, slowly progressive ALS or PLS; in these cases, MND and HSP genetic testing may be appropriate. Long survivors are more likely to have a likely pathogenic variant than previous estimates of gene carrier status in MND populations; however, this may reflect the breadth of our gene panel. The majority of MND patients do not undergo routine genetic testing due to variable variant penetrance, challenges with variant classification and lack of relevant treatment options for people with MND. However, recent evidence suggests that limiting genetic testing to people with presumed prior probability of having a pathogenic mutation (i.e. those with a family history and young-onset disease) may fail to capture a significant proportion of potential actionable or informative genetic mutations [45].
Our study demonstrates that people with slowly progressive MND may particularly benefit from genetic input, aiming for avoidance of prolonged and unnecessary investigation, earlier diagnosis and access to disease-specific services and care.

Funding
DL received funding for PhD study at the inception of this study from the Chief Scientist Office for Scotland, the Motor Neuron Disease Association and Motor Neuron Disease Scotland (CAF/MND/15/01). This work is also supported by the UK Dementia Research Institute which receives its funding from UK DRI Ltd, funded by the UK Medical Research Council, Alzheimer's Society and Alzheimer's Research UK.

Data Availability
Data supporting the genetic findings of this study are available within the article and supplementary material. Raw CARE-MND data are not available due to their containing information that could compromise the privacy of research participants. Further information about the CARE-MND database can be found at: https://www.caremnd.org.uk/.

Conflicts of interest
The authors report no competing interests.

Open Access
This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.