Abstract
A biallelic (AAGGG) expansion in the poly(A) tail of an AluSx3 transposable element within the gene RFC1 is a frequent cause of cerebellar ataxia, neuropathy, vestibular areflexia syndrome (CANVAS), and more recently, has been reported as a rare cause of Parkinson’s disease (PD) in the Finnish population. Here, we investigate the prevalence of RFC1 (AAGGG) expansions in PD patients of non-Finnish European ancestry in 1609 individuals from the Parkinson’s Progression Markers Initiative study. We identified four PD patients carrying the biallelic RFC1 (AAGGG) expansion and did not identify any carriers in controls.
A biallelic pentanucleotide repeat expansion (AAGGG) in the poly(A) tail of an AluSx3 transposable element in the replication factor C subunit 1 (RFC1) gene is a frequent cause of cerebellar ataxia, neuropathy, and vestibular areflexia syndrome (CANVAS)1. Further, the length of the biallelic “AAGGG” expansion is disease-modifying, as an inverse correlation was observed between the size of expansions and age at neurological onset, age at onset of dysarthria and/or dysphagia, and age at the use of one stick2.
More recent genetic studies have broadened the phenotypic spectrum of RFC1 expansions. Several groups have investigated the prevalence of RFC1 expansions in Parkinsonian disorders, including multiple system atrophy, with conflicting findings3,4. For Parkinson’s disease (PD) specifically, Kytövuori et al. found that three out of 569 patients with PD were carriers of the biallelic RFC1 (AAGGG) expansion, suggesting that this expansion may be a rare cause of PD in the Finnish population5.
In this study, we aimed to profile the biallelic RFC1 “AAGGG” repeat expansion in PD patients of non-Finnish European ancestry, using 903 cases and 706 controls from the PPMI cohort. Because of the complexity of the RFC1 repeat, short-read whole-genome sequencing (WGS) data can yield false positives; hence, experimental validation is required. From the short-read analysis, five patients were predicted to carry the biallelic expansion. However, Oxford Nanopore Technologies (ONT) long-read WGS showed that one predicted carrier was a false positive, leaving four validated carriers. All four were PD patients, resulting in an estimated frequency of 0.44% in PD (4 of 903 cases). No controls carried the biallelic “AAGGG” RFC1 repeat expansion. In the ONT long-read analysis, the biallelic “AAGGG” expansions ranged from 333 to 1183 repeat units in the four carriers, which is slightly larger than the 144–820 repeats reported in PD patients in the Finnish population5, but a narrower range than that observed in CANVAS patients of European ancestry, which spanned 400 to 2000 repeats1 (Supplementary Table 2).
For the four “AAGGG” RFC1 carriers, some variation was observed in the clinical phenotype (Table 1). Overall, however, in agreement with previous observations of the repeat expansion in PD patients, neither the presentation nor the disease course differed from those of other PD patients. Patient 1 developed PD at the age of 57. She presented with tremors, rigidity, and bradykinesia as motor symptoms (MDS-UPDRS 24 points, Hoehn and Yahr (H&Y) stage 2), and with depression, mild cognitive decline, constipation, and insomnia as non-motor symptoms. Her symptoms did not show much progression until the latest PPMI visit (one and a half years after onset) since she had not taken any medications. Her dopamine transporter (DaT) imaging was normal at the initial diagnosis, and her alpha-synuclein (aSyn) seed amplification assay (SAA) was negative.
Patient 2 developed PD at the age of 65. She presented with tremors, rigidity, bradykinesia, and postural instability at diagnosis (H&Y stage 1, MDS-UPDRS part 3, 18 points). Approximately 2 years after onset, her symptoms had progressed (H&Y stage 3, MDS-UPDRS part 3, 35 points) with a levodopa equivalent daily dose (LEDD) of 900 mg, and her aSyn SAA was negative.
Patient 3 developed PD at the age of 54, presenting with tremors, bradykinesia, and hyposmia. At the age of 61, 9 years from the onset, her H&Y stage was 2 with 900 mg of LEDD. She showed a positive reaction in aSyn SAA. DaT imaging showed decreased binding in the putamen.
Patient 4 developed PD at the age of 76. A year after the onset, his H&Y stage was 2, MDS-UPDRS part 3 was 16, accompanied by constipation and insomnia. Clinical data of follow-up visits were not available and aSyn SAA was not performed for this patient. Genetic testing revealed that he was a carrier of the known damaging LRRK2 p.G2019S variant. DaT imaging results were not available.
In this study, we leveraged short-read WGS data from the PPMI cohort and the computational tool str-analysis to genetically screen 1609 individuals for the biallelic “AAGGG” repeat expansion. We identified four PD patient carriers and no control carriers, giving a frequency of 0.44% in PD. Of note, when we excluded carriers of known pathogenic variants in LRRK2, GBA1, and SNCA, and those with scans without evidence of dopaminergic deficits (SWEDD), the estimated frequency in PD was higher (0.84%), which is slightly higher than the frequency of 0.53% previously reported in PD patients in the Finnish population5.
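The headline frequencies quoted above follow directly from the carrier counts given in the text (4 of 903 PPMI PD cases; 3 of 569 Finnish PD cases). The short sketch below reproduces that arithmetic. The Wilson score interval is an illustrative addition that is not part of the original analysis, and the function names are hypothetical; it is included only to show how wide the uncertainty is when carrier counts are this small.

```python
import math


def carrier_frequency(carriers: int, total: int) -> float:
    """Return the carrier frequency as a percentage."""
    return 100.0 * carriers / total


def wilson_interval(carriers: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval (in percent) for a binomial proportion.

    Illustrative only: the source article reports point estimates,
    not confidence intervals.
    """
    p = carriers / total
    denom = 1.0 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return 100.0 * max(0.0, centre - half), 100.0 * min(1.0, centre + half)


if __name__ == "__main__":
    # Counts taken from the text: PPMI (this study) and the Finnish cohort (ref. 5).
    for label, carriers, total in [("PPMI PD cases", 4, 903), ("Finnish PD cases", 3, 569)]:
        low, high = wilson_interval(carriers, total)
        print(f"{label}: {carrier_frequency(carriers, total):.2f}% "
              f"(95% Wilson interval {low:.2f}-{high:.2f}%)")
```

With so few carriers in either cohort, the intervals overlap substantially, so the two point estimates should be read as broadly compatible rather than as evidence of a population difference.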
Interestingly, the reported clinical phenotype of these patients is in line with typical PD symptoms, and the clinical data showed no clear red flags suggesting that the diagnosis was incorrect. However, it is worth noting that no specific ataxia phenotype data are collected, and therefore we cannot exclude misdiagnosis. Indeed, only one of the three patients with available data showed a positive reaction in aSyn SAA. Notably, SAA positivity is generally very high in PPMI PD cases and is influenced by genetic status: 67.5% of LRRK2 cases are SAA positive, whereas typical non-LRRK2 cases show a remarkably high SAA positivity rate of 93.3%.
As demonstrated in this study, long-read DNA sequencing is a powerful tool and a required step to validate potential pathogenic repeat expansion carriers. Short-read sequencing methods are notorious for over- or underestimating repeat expansion lengths, and the RFC1 locus is further complicated by its variable motif sequence. Therefore, although the frequency reported in the present study is in line with, if not slightly higher than, that identified in the Finnish study using PCR for large amplicons (XL-PCR) and repeat-primed PCR (see ref. 5), short-read sequencing, given its limitations, can still lead to false negatives. As such, generating population-scale long-read DNA sequencing datasets to capture repeat expansions that are currently hidden from traditional methods is an essential step towards solving the architecture of complex genetic disorders6. For PD specifically, the Global Parkinson’s Genetics Program (GP2; www.gp2.org) is leading a large-scale initiative to long-read DNA sequence ~1000 case-control samples (Fig. 1).
Methods
Cohort information
Samples were obtained from the Parkinson’s Progression Markers Initiative (PPMI; https://www.ppmi-info.org/). Clinical and demographic characteristics of the PPMI cohort are shown in Supplementary Table 1. Participants included PD cases clinically diagnosed by experienced neurologists and control individuals. All PD cases met the criteria defined by the UK PD Society Brain Bank7. All individuals were of European descent and were not age- or gender-matched. This included a total of 903 cases and 706 neurologically healthy controls. PD cases ranged from 33 years to 90 years of age at diagnosis (mean 61.7 ± 11.05, median 63.0) and included 62 individuals who showed SWEDD and 368 individuals who carry known genetic mutations associated with PD (within LRRK2, GBA1, and SNCA). Control subjects ranged from 19 years to 86 years of age (mean 58.3 ± 11.54, median 60.0) and included 503 individuals who carry known genetic mutations associated with PD.
Short-read analysis
Short-read whole-genome sequencing data in BAM format were downloaded through AMP-PD and have been reported in detail previously by Iwaki et al.8. For short-read data analysis, processing followed the GATK best-practice pipeline, with the fastq files aligned to the hg38 reference genome using BWA-MEM. The STR detection tool str-analysis was used to screen for biallelic RFC1 (AAGGG) expansion carriers (https://github.com/broadinstitute/str-analysis).
Long-read validation of expansion in carriers
To validate the five individuals predicted to carry pathogenic RFC1 “AAGGG” biallelic repeat expansions, ONT whole-genome long-read DNA sequencing was performed. For all predicted carriers, a library was prepared from the individual’s DNA with either the SQK-LSK110 (ref. 9) or the SQK-LSK114 (ref. 10) ligation sequencing kit from ONT. The samples were quantified using a Qubit fluorometer, loaded onto a PromethION R9.4.1 flow cell (SQK-LSK110) or R10.4 flow cell (SQK-LSK114) following ONT standard operating procedures, and run for a total of 72 h on a PromethION device (Supplementary Table 2).
Fast5 files containing the raw signal data were obtained from sequencing performed using MinKNOW (ONT). All fast5 files were used to perform super-accuracy base calling on each sample with Guppy v6.0.1 (R9; ONT) or Dorado v0.5.0, and sequencing statistics were obtained with seqkit v2.2.0 using the fastq files that passed the quality control filters of the super-accuracy base calling. To accurately determine the length of the RFC1 repeat expansion from the ONT data, the fastq files were first mapped to the hg38 reference using LAST, as required by tandem-genotypes and as described in detail here (https://github.com/mcfrith/last-rna/blob/master/last-long-reads.md)11. To size the expansion on each allele, tandem-genotypes was then run on the mapped files12.
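To make the notion of “repeat units” concrete, the toy sketch below counts the longest uninterrupted run of a motif in a sequence string and converts repeat units to base pairs. It is not the method used by tandem-genotypes, which works from LAST alignments and tolerates the sequencing errors present in real ONT reads; exact string matching as shown here would undercount on raw reads. The function names and the example sequence are hypothetical.

```python
def longest_tandem_run(sequence: str, motif: str = "AAGGG") -> int:
    """Return the largest number of back-to-back exact copies of `motif`.

    Toy illustration only: exact matching, no tolerance for
    sequencing errors or motif interruptions.
    """
    seq = sequence.upper()
    unit = motif.upper()
    k = len(unit)
    if k == 0:
        raise ValueError("motif must not be empty")
    # runs[i] = number of consecutive motif copies starting at position i.
    runs = [0] * (len(seq) + 1)
    best = 0
    # Fill from the right so runs[i + k] is already known when visiting i.
    for i in range(len(seq) - k, -1, -1):
        if seq.startswith(unit, i):
            runs[i] = 1 + runs[i + k]
            best = max(best, runs[i])
    return best


def repeat_units_to_bp(units: int, motif: str = "AAGGG") -> int:
    """Convert a repeat-unit count to the expansion length in base pairs."""
    return units * len(motif)


if __name__ == "__main__":
    # Hypothetical error-free read: flanking sequence around 333 AAGGG units.
    read = "ACGTTGCA" + "AAGGG" * 333 + "TTGACCAT"
    units = longest_tandem_run(read)
    print(units, "repeat units =", repeat_units_to_bp(units), "bp")
```

On this scale, the 333 to 1183 repeat units reported for the four carriers correspond to expansions of roughly 1.7 to 5.9 kb, lengths that long reads can span in a single molecule.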
Reporting summary
Further information on research design is available in the Nature Research Reporting Summary linked to this article.
Data availability
Data used in the preparation of this article were obtained from the PPMI database (www.ppmi-info.org/access-data-specimens/download-data), RRID: SCR_006431. For up-to-date information on the study, visit www.ppmi-info.org. The PPMI cohort and the ONT raw data will be available at the LONI IDA.
References
Cortese, A. et al. Biallelic expansion of an intronic repeat in RFC1 is a common cause of late-onset ataxia. Nat. Genet. 51, 649–658 (2019).
Cortese, A. et al. Repeat expansion size predicts age of onset in RFC1 CANVAS and disease spectrum (S29.005). Neurology 98, (2022).
Wan, L. et al. Biallelic intronic AAGGG expansion of RFC1 is related to multiple system atrophy. Ann. Neurol. 88, (2020).
Sullivan, R. et al. Letter: RFC1-related ataxia is a mimic of early multiple system atrophy. J. Neurol. Neurosurg. Psychiatry 92, 444 (2021).
Kytövuori, L. et al. Biallelic expansion in RFC1 as a rare cause of Parkinson’s disease. NPJ Parkinsons Dis. 8, 6 (2022).
De Coster, W., Weissensteiner, M. H. & Sedlazeck, F. J. Towards population-scale long-read sequencing. Nat. Rev. Genet. 22, 572–587 (2021).
Gelb, D. J., Oliver, E. & Gilman, S. Diagnostic criteria for Parkinson disease. Arch. Neurol. 56, 33 (1999). https://doi.org/10.1001/archneur.56.1.33
Iwaki, H. et al. Accelerating medicines partnership: Parkinson’s disease. Genetic resource. Mov. Disord. 36, 1795–1804 (2021).
Billingsley, K. J. Processing frozen human blood samples for population-scale Oxford nanopore long-read DNA sequencing SOP v1. https://doi.org/10.17504/protocols.io.ewov1n93ygr2/v1 (2022).
Miano-Burkhardt, A. Processing frozen human blood samples for population-scale SQK-LSK114 Oxford nanopore long-read DNA sequencing SOP v1. https://doi.org/10.17504/protocols.io.x54v9py8qg3e/v1 (2023).
Kiełbasa, S. M., Wan, R., Sato, K., Horton, P. & Frith, M. C. Adaptive seeds tame genomic sequence comparison. Genome Res. 21, 487–493 (2011).
Mitsuhashi, S. et al. Tandem-genotypes: robust detection of tandem repeat expansions from long DNA reads. Genome Biol. 20, 58 (2019).
Acknowledgements
We would like to thank all of the participants who donated their time and biological samples to be a part of this study. This work was supported in part by the Intramural Research Programs of the National Institute on Aging (NIA) and the National Institute of Neurological Disorders and Stroke (NINDS), part of the National Institutes of Health, Department of Health and Human Services; project numbers AG000542, Z01-AG000949, 1ZIANS003154. This work utilized the computational resources of the NIH HPC Biowulf cluster (http://hpc.nih.gov). Short-read WGS data used in the preparation of this article were obtained from the Accelerating Medicines Partnership® (AMP®) Parkinson’s Disease (AMP PD) knowledge platform. For up-to-date information on the study, visit https://www.amp-pd.org. The AMP® PD program is a public–private partnership managed by the Foundation for the National Institutes of Health and funded by the NINDS in partnership with the Aligning Science Across Parkinson’s (ASAP) initiative; Celgene Corporation, a subsidiary of Bristol-Myers Squibb Company; GlaxoSmithKline plc (GSK); The Michael J. Fox Foundation for Parkinson’s Research; Pfizer Inc.; Sanofi US Services Inc.; and Verily Life Sciences. ACCELERATING MEDICINES PARTNERSHIP and AMP are registered service marks of the U.S. Department of Health and Human Services. Clinical data and biosamples used in the preparation of this article were obtained from the MJFF Parkinson’s Progression Marker Initiative (PPMI). PPMI—a public–private partnership—is funded by the Michael J. Fox Foundation for Parkinson’s Research and funding partners, including 4D Pharma, Abbvie, AcureX, Allergan, Amathus Therapeutics, ASAP, AskBio, Avid Radiopharmaceuticals, BIAL, Biogen, Biohaven, BioLegend, BlueRock Therapeutics, Bristol-Myers Squibb, Calico Labs, Celgene, Cerevel Therapeutics, Coave Therapeutics, DaCapo Brainscience, Denali, Edmond J. Safra Foundation, Eli Lilly, Gain Therapeutics, GE HealthCare, Genentech, GSK, Golub Capital, Handl Therapeutics, Insitro, Janssen Neuroscience, Lundbeck, Merck, Meso Scale Discovery, Mission Therapeutics, Neurocrine Biosciences, Pfizer, Piramal, Prevail Therapeutics, Roche, Sanofi, Servier, Sun Pharma Advanced Research Company, Takeda, Teva, UCB, Vanqua Bio, Verily, Voyager Therapeutics, the Weston Family Foundation, and Yumanity Therapeutics. The PPMI Investigators have not participated in reviewing the data analysis or content of the manuscript. For up-to-date information on the study, visit www.ppmi-info.org. We would also like to thank the team at PPMI for sending frozen blood samples to complete the long-read DNA validation, specifically Tatiana M. Foroud, Jan E. Hamer, Caitlin D. Schulz, Bradford Casey, and Mark Frasier. We would also like to thank Ben Weisburd for his guidance with the str-analysis tool. K. Daida was supported by the JSPS research fellowship for Japanese biomedical and behavioral researchers at NIH.
Funding
Open access funding provided by the National Institutes of Health.
Author information
Contributions
P.A.J., R.S., J.V., H.H., C.B., A.B.S., J.H., and K.J.B. designed, executed, reviewed, and critiqued the study. P.A.J., K.D., A.M.-B., L.M., J.D., J.R.B., A.M., M.A.N., R.K.K., F.J.S., B.C., and K.J.B. ran the analysis and generated the long-read data for validation. H.I., G.C., and M.B.M. reviewed and critiqued the analysis.
Ethics declarations
Competing interests
ABS is an editor for npj Parkinson’s Disease. ABS was not involved in the journal’s review of, or decisions related to, this manuscript. All remaining authors declare no competing interests.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Alvarez Jerez, P., Daida, K., Miano-Burkhardt, A. et al. Profiling complex repeat expansions in RFC1 in Parkinson’s disease. npj Parkinsons Dis. 10, 108 (2024). https://doi.org/10.1038/s41531-024-00723-0