Identifying unstable CNG repeat loci in the human genome: a heuristic approach and implications for neurological disorders

Suroliya, Varun; Uppili, Bharathram; Kumar, Manish; Jha, Vineet; Srivastava, Achal K.; Faruq, Mohammed

doi:10.1038/s41439-024-00281-0

Published: 13 June 2024

Volume 11, article number 25, (2024)

Human Genome Variation

Varun Suroliya¹,
Bharathram Uppili^2,3,
Manish Kumar^2,3,
Vineet Jha⁴,
Achal K. Srivastava¹ &
…
Mohammed Faruq ORCID: orcid.org/0000-0001-8278-8396²

Abstract

Tandem nucleotide repeat (TNR) expansions, particularly the CNG nucleotide configuration, are associated with a variety of neurodegenerative disorders. In this study, we aimed to identify novel unstable CNG repeat loci associated with the neurogenetic disorder spinocerebellar ataxia (SCA). Using a computational approach, 15,069 CNG repeat loci in the coding and noncoding regions of the human genome were identified. Based on the feature selection criteria (repeat length >10 and functional location of repeats), we selected 52 repeats for further analysis and evaluated the repeat length variability in 100 control subjects. A subset of 19 CNG loci observed to be highly variable in control subjects was selected for subsequent analysis in 100 individuals with SCA. The genes with these highly variable repeats also exhibited higher gene expression levels in the brain according to the tissue expression dataset (GTEx). No pathogenic expansion events were identified in patient samples, which is a limitation given the size of the patient group examined; however, these loci contain potential risk alleles for expandability. Recent studies have implicated GLS, RAI1, GIPC1, MED15, EP400, MEF2A, and CNKSR2 in neurological diseases, with GLS, GIPC1, MED15, RAI1, and MEF2A sharing the same repeat loci reported in this study. This finding validates the approach of evaluating repeat loci in different populations and their possible implications for human pathologies.

Introduction

Spinocerebellar ataxia (SCA) and other neuromuscular disorders are part of a group of neurodegenerative disorders that share a common disease mechanism of tandem nucleotide repeat expansions^1,2. The available literature and various databases have identified CNG nucleotide repeats as the most prevalent cause of these neurodegenerative diseases, such as spinocerebellar ataxia^3,4,5. Nearly 30-40% of ataxia cases can be explained by trinucleotide CNG repeat expansions^3,4,5. Thus, identifying whether CNG expansions in other genes are a causal mechanism for unexplained cases of ataxia and other neuromuscular diseases is imperative.

Earlier gene discovery efforts based on classical gene mapping identified these CNG expansions as causal events. However, rapidly identifying novel CNG expansions at the cohort level is difficult because of both the rarity of the disease and its clinical heterogeneity. Additionally, these genomic regions are considered dark regions, which are inaccessible to traditional methods that rely on short-read sequencing data. The high cost of long-read sequencing, along with time constraints, restricts the scope of such screening.

In 2004, Pandey et al. tested a different approach and computationally reviewed the CAG repeats in the entire genome, identifying two CAG loci as putative candidates for SCA⁶.

In this study, we used a combination of computational and genetic methods to identify possible disease-causing unstable repeat loci using a heuristic approach, which may serve as a cost-effective solution.

Methodology

Computational approach for TNR screening in the human genome

The FASTA sequences of individual chromosomes in the human reference genome (hg19) were downloaded from the UCSC genome browser (http://hgdownload.soe.ucsc.edu/goldenPath/hg19/chromosomes/). The CNG repeat units associated with various neurological and neuromuscular disorders were selected from the literature for this analysis¹.

A program to find all possible repeats with a minimum of 4 contiguous repeat units was written in Python.

The methodology involved parsing a single whole-genome FASTA file. Using regular expressions, repetitive patterns of the specified repeat units were identified within each chromosome. Each chromosome was then scanned systematically to detect potential repeat segments, together with their complementary repeats; the output comprised the start and end positions relative to the beginning of the respective chromosome, the length of the repeat, and the repeat unit (source code is available at https://github.com/bharathramh/STR_repeat/blob/main/str.py). The positions were then functionally annotated using the offline version of ANNOVAR: we first downloaded the hg19 databases from ANNOVAR and then annotated the repeat regions using the table_annovar.pl command.
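The regular-expression scan described above can be illustrated with a minimal Python sketch. This is not the authors' str.py; the names `read_fasta`, `find_repeats`, `RepeatHit`, and `CNG_UNITS` are hypothetical, and the motif set (CAG, CTG, CCG, CGG, searched in a single phase on the forward strand) is an assumption made for illustration.

```python
import re
from typing import Iterable, Iterator, NamedTuple, Tuple


class RepeatHit(NamedTuple):
    chrom: str
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive
    unit: str
    copies: int


# Assumed motif set: the four C-N-G units, forward strand only.
CNG_UNITS = ("CAG", "CTG", "CCG", "CGG")


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (record_name, sequence) pairs from FASTA-formatted lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Keep only the first word of the header as the record name.
            name, chunks = (line[1:].split() or [""])[0], []
        elif line:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def find_repeats(chrom: str, seq: str,
                 units: Iterable[str] = CNG_UNITS,
                 min_copies: int = 4) -> Iterator[RepeatHit]:
    """Report uninterrupted runs of at least min_copies copies of each unit."""
    seq = seq.upper()  # reference FASTA files may contain soft-masked lowercase
    for unit in units:
        pattern = re.compile(f"(?:{unit}){{{min_copies},}}")
        for m in pattern.finditer(seq):
            copies = (m.end() - m.start()) // len(unit)
            yield RepeatHit(chrom, m.start() + 1, m.end(), unit, copies)
```

In practice, `read_fasta` would be given an open file handle, and each resulting hit could be written out as a coordinate record for annotation. Because each motif is matched in one phase only, a run that begins in a shifted frame (for example AGC instead of CAG) may be reported one unit shorter than its full extent.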

Sample enrollment

A total of 100 patients with genetically uncharacterized SCA (retrospective + prospective cases) were enrolled. Patients exhibited autosomal dominant or X-linked inheritance and a sporadic late age of onset and were negative for SCA1, SCA2, SCA3, SCA6, SCA7, SCA8, SCA12, SCA17 and FRDA.

The mean age (SD) of the patients was 59.07 (8.12) years (range, 42–84 years), and the mean age at disease onset was 55.54 (7.11) years (range, 42–70 years).

The control samples (N = 100) were obtained from the DNA repository of the Indian Genome Variation Consortium project⁷. We divided the analysis into two stages; the first stage focused on finding the unstable CNG sites in the genome, and the second stage investigated these unstable loci in patients with genetically uncharacterized SCA to identify disease-associated expansion-prone novel repeat loci.

Evaluation of repeat length variability at the selected loci

A total of 52 loci were targeted for CNG length estimation by polymerase chain reaction (PCR) using an M13-tagged forward primer, a reverse primer, and a fluorescently labeled M13-tagged primer.

For PCR amplification, the sample consisted of template human DNA (25 ng), PCR master mix (Epicentre’s FailSafe mix or Promega master mix), 0.1 µl of forward primer (10 pM/µl), 0.4 µl of reverse primer (10 pM/µl), and 0.4 µl of M13-tagged primer (10 pM/µl) in a reaction volume of 10 µl. The PCR conditions were 95 °C for 3 min; 35 cycles of denaturation at 95 °C for 30 sec, annealing at 60 °C for 30 sec, and extension at 72 °C for 1 min; and a final extension at 72 °C for 5 min. The samples were analyzed using a fragment analyzer and visualized with GeneMapper software (version 4, Applied Biosystems).

Results

Identification of CNG repeats from the human reference genome sequence

Through genome-wide CNG repeat selection, we found a total of 15,069 loci (≥4 contiguous repeats) (Fig. 1). The CNG repeats were abundant in the coding region and UTR. Overall, CAG and CTG repeats were most abundant across different genomic regions (Table 1). We annotated these repeat loci using ANNOVAR⁸ and further categorized the tandem repeats based on the length observed in the reference genome: Group 1, 4–6 repeats; Group 2, 7–9 repeats; and Group 3, >9 repeats (Table 1).
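The three length groups can be expressed as a small binning function. The following is a minimal sketch; the function names `length_group` and `tally_groups` are hypothetical, and the input is assumed to be the contiguous repeat count of each locus in the reference genome.

```python
from collections import Counter
from typing import Iterable


def length_group(copies: int) -> str:
    """Map a reference repeat count to the length groups described above."""
    if copies < 4:
        raise ValueError("loci with fewer than 4 contiguous units were not catalogued")
    if copies <= 6:
        return "Group 1"   # 4-6 repeats
    if copies <= 9:
        return "Group 2"   # 7-9 repeats
    return "Group 3"       # >9 repeats


def tally_groups(copy_counts: Iterable[int]) -> Counter:
    """Count how many loci fall into each length group."""
    return Counter(length_group(c) for c in copy_counts)
```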

**Fig. 1: Outline of the study design.**

Table 1 Categorization of CNG repeat loci based on location and number of repeats in the reference genome.

Using a reductionist approach for further analysis, we selected 52 loci located in the CDS region or UTR with ≥10 contiguous repeats (Table 2 and Fig. 2). Repeats with more than 10 units are more prone to expansion events⁹ and cause a decrease in the activity of flap endonuclease-1 (FEN1) on Okazaki fragments¹⁰. Furthermore, most pathogenic trinucleotide repeat expansions were observed in the coding region or UTR, for example, in SCA1–SCA3 (CAG expansion in the coding region), SCA12 (CAG expansion in the 5’ UTR), and myotonic dystrophy (CTG expansion in the 3’ UTR).
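The two selection criteria (functional location and repeat length) amount to a simple filter over the annotated loci. The sketch below assumes each locus is a dictionary with hypothetical keys `region` and `copies`, and that region labels follow ANNOVAR-style values such as `exonic`, `UTR5`, and `UTR3`; the exact labels used in the study are not stated, so these are illustrative.

```python
from typing import Dict, Iterable, List, Set, Union

Locus = Dict[str, Union[str, int]]

# Assumed region labels (ANNOVAR-style); adjust to the actual annotation output.
SELECTED_REGIONS: Set[str] = {"exonic", "UTR5", "UTR3"}


def select_candidates(loci: Iterable[Locus],
                      min_copies: int = 10,
                      regions: Set[str] = SELECTED_REGIONS) -> List[Locus]:
    """Keep loci in coding or UTR regions with at least min_copies contiguous units."""
    return [
        locus for locus in loci
        if locus["copies"] >= min_copies and locus["region"] in regions
    ]
```

Loci whose annotation combines several labels in one field would need to be split before filtering; that step is omitted here.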

Table 2 List of 52 selected loci and their repeat status in control samples (unstable loci are marked in bold).

Genotyping of 52 CNG repeats in a control Indian population

By assessing the length variability of the 52 loci in control samples, 33 loci were found to be relatively stable (length variability of 1–6 repeat units), and 19 loci were more polymorphic in nature (length variability of 7–23 repeat units). These 19 more variable repeat loci (RAI1, UMAD1, GLS, HTR7P1, CNKSR2, MAML3, MED15, MLLT3, USF3, MEF2A, MIR205HG, NCOR2, RPL14, JPH3, MAB21L1, ANKUB1, ERF, GIPC1, and EP400) were further screened in our ataxia patient cohort to identify any length variation that might be pathogenic (Fig. 3).

The MAB21L1, ANKUB1, and GLS genes were highly polymorphic and had a wide range of repeat distributions in the population [modes of repeats (ranges): 13 (8–26), 15 (8–33), and 12 (6–29), respectively]. The genes ANKUB1 and UMAD1 exhibited a large number of repeats (>30 repeats) in both the case and control groups. No significant difference in the large expansion range was observed between the case and control screenings (Table 2).
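The per-locus summaries reported here (mode, range, and the stable versus polymorphic split) can be sketched as follows. This sketch assumes that "length variability" means the spread between the smallest and largest observed allele, in repeat units, with a spread of up to 6 units treated as stable; that reading is an interpretation of the text. The function name `summarize_locus` and the input values in the usage note are hypothetical.

```python
from collections import Counter
from typing import Dict, Sequence, Union


def summarize_locus(allele_copies: Sequence[int],
                    stable_max_spread: int = 6) -> Dict[str, Union[int, str]]:
    """Summarize the observed repeat counts (one entry per allele) at one locus."""
    if not allele_copies:
        raise ValueError("at least one allele is required")
    smallest, largest = min(allele_copies), max(allele_copies)
    spread = largest - smallest  # assumed definition of length variability
    # most_common(1) returns the most frequent value; ties resolve to the
    # value that was encountered first in the input.
    modal = Counter(allele_copies).most_common(1)[0][0]
    return {
        "mode": modal,
        "min": smallest,
        "max": largest,
        "spread": spread,
        "category": "stable" if spread <= stable_max_spread else "polymorphic",
    }
```

For instance, `summarize_locus([12, 12, 13, 8, 26, 12])` yields a mode of 12, a range of 8–26, and the category `polymorphic`.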

The heterozygosity indices (HIs, which measure the number of heterozygotes in the population) of UMAD1, MAB21L1, ANKUB1, GLS, and RPL14 were greater than 0.7 in both cases and controls. On the other hand, MLLT3 and CNKSR2 were less polymorphic and had more homozygous repeats (HI ≤ 0.1) in both groups. Most of the target loci fell within the range of 0.3 to 0.7, except for ERF, which had an HI of less than 0.25 in all samples.
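The heterozygosity index can be illustrated with a short calculation. The sketch below assumes that HI is the observed heterozygosity, that is, the fraction of genotyped individuals carrying two alleles of different repeat length; the function name `heterozygosity_index` and the genotypes in the usage note are hypothetical.

```python
from typing import Sequence, Tuple


def heterozygosity_index(genotypes: Sequence[Tuple[int, int]]) -> float:
    """Fraction of individuals whose two alleles differ in repeat count."""
    if not genotypes:
        raise ValueError("at least one genotype is required")
    heterozygotes = sum(1 for a, b in genotypes if a != b)
    return heterozygotes / len(genotypes)
```

With the genotypes `[(12, 12), (12, 15), (8, 26), (13, 13)]`, two of four individuals are heterozygous, giving an HI of 0.5. Hemizygous genotypes at X-linked loci would need separate handling, which this sketch does not attempt.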

Selection of unstable CNG repeats in the 1000 Genomes database

Since disease-associated tandem repeats tend to be more polymorphic in the general population, we investigated the polymorphic nature of these loci in the control population. Compared with different 1000 Genomes control populations, the mode of repeats and variability in the GLS gene were greater in the African and SAS populations (Table 3). MAB21L1 exhibited a greater repeat range in the EAS population. Although some of the other loci had a maximum of >20 repeat expansions, these loci were uniform or less variable within the populations. MEF2A was highly variable, ranging from 2 to 16 repeats, but it was uniform throughout the population. GIPC1 repeat variability was less common in the EUR population. For MED15 and ERF, repeat data were available for very few patient samples among different populations. We could not find any short tandem repeat data for the HTR7P1, RPL14, CNKSR2, or MLLT3 repeat loci. Our repeat data for the GLS, ANKUB1, EP400, JPH3, and RAI1 loci showed a biallelic distribution, which is also observed in other major populations.

Table 3 Features and characteristics of 19 polymorphic loci.

Interestingly, we observed variability in the repeat ranges of USF3, MEF2A, JPH3, RAI1, ERF, MED15, MAML3, and UMAD1 compared to those of other world populations, but none of the differences were significant according to the Wilcoxon signed rank test (nonparametric test). Both our groups had comparatively fewer repeats for EP400 loci (Table 4). The probable reason for this difference is the use of different sequencing technologies; short-read sequencing was employed for the 1000 Genomes Project data. While short-read sequencing has its advantages, it also has some inherent inefficiency in regard to capturing long-range repeats and complex genomic regions.

Table 4 Repeat length variability in 1000 Genomes subpopulations.

Analysis of the expression levels of genes harboring unstable repeats

For all the candidate genes, the bulk tissue gene expression of each gene was compared among different tissues using GTEx¹¹. The analysis showed that the CNKSR2, MAB21L1, USF3, RAI1, NCOR2, JPH3, MAML3, EP400, and GLS genes were significantly highly expressed in the brain, particularly in the cerebellum. All the other genes, except for MIR205HG, also exhibited significant expression levels in the brain (Table 2). Since the pathogenesis of SCA is associated with the brain, we excluded MIR205HG from the gene shortlist. Thus, we proposed the pathogenicity of the remaining 18 genes, which might show an ataxia phenotype.

Discussion

Repeat instability is an underlying mutational mechanism for several neurodegenerative disorders in humans. Understanding how repeat instability leads to disease manifestation has always been challenging. Several distinct hypotheses on repeat expansion have been proposed over the years, but the mechanism is not fully understood^2,12,13,14,15. Repeat instability is the most prevalent genetic cause of spinocerebellar ataxia worldwide. Identifying repeat expansion regions is likewise difficult. In recent years, long-read next-generation sequencing has been an effective method for identifying these targets, but this method is costly and requires substantial infrastructure and highly trained personnel. Here, we used a cost-effective alternative approach for the investigation of tandem nucleotide repeats.

The initial phase of the study utilized a computational approach, yielding 52 candidate CNG repeat loci from various genes for further investigation in the Indian control population. Using a cost-effective fluorescent PCR-based fragment analysis approach, we identified 19 highly polymorphic repeat targets after screening the control samples.

Genetic markers for the same disorder have been shown to be expressed among various populations in diverse ways, with some diseases and genetic markers being population specific. Therefore, in the second phase of the study, we screened these putative candidates in patients with genetically uncharacterized, clinically confirmed SCA. Although no large expansion of these target loci was identified in the study population, repeat polymorphisms in other populations of the 1000 Genomes Project were used as a proof of concept. We evaluated all identified unstable markers in different major populations and in our control and patient samples to understand the population variability among these loci. We found repeat data for 15 of the 19 selected CNG loci in the 1000 Genomes STR database.

Additionally, the GTEx data showed that, except for MIR205HG, the remaining 18 loci were expressed in various brain tissues, making them more suitable for further investigations. However, of the 18 identified highly unstable repeat loci, none exhibited large repeat expansions in our patient population.

Multiple studies published in recent years on point and repeat expansion mutations in various neurological disorders involving genes from the proposed list of 18 support the strategy adopted in this study^16,17,18,19. Variation in the length of CAG repeats in the RAI1 gene is associated with differences in age at onset in spinocerebellar ataxia type 1 patients among various populations¹⁶. In 2019, Rad et al. reported that a point mutation in MAB21L1 causes a syndromic neurodevelopmental disorder with distinctive cerebellar, ocular, craniofacial, and genital features (COFG syndrome)²⁰. Another study suggested that a point mutation in CNKSR2 is associated with seizures and mild intellectual disability¹⁶. In 2020, a report was published suggesting that frameshift mutations of GLI3, ANKUB1, and TAS2R3 might alter protein functions and accelerate the progression of polysyndactyly (PSD), an autosomal dominant genetic limb malformation²¹. The EP400 gene has been proposed to play a significant role in oligodendrocyte survival and myelination in the vertebrate central nervous system²². One study proposed that differences in the polyglutamine repeat length in MED15 change the expression of diverse stress pathways¹⁷. In various populations, CAG repeat variation in MEF2A is a risk factor for coronary artery disease (CAD)¹⁸. A study published in 2020 suggested that a CGG repeat expansion mutation in the 5’ UTR of GIPC1 causes oculopharyngodistal myopathy (OPDM), an adult-onset inherited neuromuscular disorder¹⁹. A large GCA tandem expansion in the 5’ UTR of the GLS gene causes overall developmental delay, progressive ataxia, and elevated levels of glutamine¹⁹. Reported studies of GLS, GIPC1, MED15, RAI1, and MEF2A included the same candidate loci that we identified in our study^7,13,14,15. Although we did not identify any large repeat expansions, this previously reported evidence strengthens our study, indicating that our approach is in the right direction for the discovery of novel targets.

Limitations of the study

1. We considered only CNG repeats in the coding region and UTR with at least 10 contiguous repeats because of the large number of target loci. Considering other tri-, tetra-, penta-, and hexanucleotide repeat units and loci with lower repeat numbers would increase the chances of identifying causal mutations.
2. We collected 100 patient samples for the study. SCA is a rare disorder, and its subtypes are very rare; therefore, a larger sample size would provide more confidence in our hypothesis.
3. Most SCA subtypes are geographic and population specific. In this study, we considered only North Indian SCA patient samples, and a multipopulation study could enhance the possibility of identifying causal mutations among the studied genes.

This study highlights the importance of the population polymorphism approach for understanding the genetic background and mechanism of tandem repeat instability in ataxia-like neurological disorders. The role of other repetitive sequences in both coding and noncoding regions in the context of neurological disorders can be explored with the help of computational and polymorphism approaches, as in this work.

Conclusion

Although our study did not positively identify any novel pathogenic CNG trinucleotide repeat expansions, it still describes an approach that utilizes population-level genomics data to address the complex genetic mechanisms underlying disease pathology. The list of novel unstable loci that we identified can be examined in other neurological and neuromuscular disease cohorts, and a larger sample size may lead to the discovery of pathogenic expansions at these loci.

Data availability

The subject-related data and code from this study will be available upon request to the corresponding author. The code used in the study is available at https://github.com/bharathramh/STR_repeat/blob/main/str.py.

References

1. Ellerby, L. M. Repeat Expansion Disorders: Mechanisms and Therapeutics. Neurotherapeutics 16, 924–927 (2019).
2. Paulson, H. Repeat expansion diseases. Handb. Clin. Neurol. 147, 105–123 (2018).
3. Perlman, S. Hereditary Ataxia Overview. 1998. In: Adam, M. P. et al., GeneReviews®. Seattle (WA): University of Washington, Seattle; 1993–2023. https://www.ncbi.nlm.nih.gov/books/NBK1138/.
4. Sharma, P. et al. Genetics of Ataxias in Indian population: a collative insight from a common genetic screening tool. Adv. Genet (Hoboken) 3, 2100078 (2022).
5. Ruano, L., Melo, C., Silva, M. C. & Coutinho, P. The global epidemiology of hereditary ataxia and spastic paraplegia: a systematic review of prevalence studies. Neuroepidemiology 42, 174–183 (2014).
6. Pandey, N., Mittal, U., Srivastava, A. K. & Mukerji, M. SMARCA2 and THAP11: potential candidates for polyglutamine disorders as evidenced from polymorphism and protein-folding simulation studies. J. Hum. Genet. 49, 596–602 (2004).
7. Indian Genome Variation Consortium. The Indian Genome Variation database (IGVdb): a project overview. Hum. Genet. 118, 1–11 (2005).
8. Wang, K., Li, M. & Hakonarson, H. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res. 38, e164 (2010).
9. Rolfsmeier, M. L. et al. Cis-elements governing trinucleotide repeat instability in Saccharomyces cerevisiae. Genetics 157, 4 (2001).
10. Mary, E. T. et al. Rate-determining Step of Flap Endonuclease 1 (FEN1) Reflects a Kinetic Bias against Long Flaps and Trinucleotide Repeat Sequences. J. Biol. Chem. 290, 34 (2015).
11. The Genotype-Tissue Expression (GTEx) project, www.gtexportal.org/home/gene
12. Fan, Y. et al. GGC repeat expansion in NOTCH2NLC induces dysfunction in ribosome biogenesis and translation. Brain 146, 3373–3391 (2023).
13. Yabuki, Y. & Shioda, N. The neuropathological mechanism on guanine-rich repeat expansion diseases. Nihon Yakurigaku Zasshi. 158, 30–33 (2023).
14. Jain, A. & Vale, R. D. RNA phase transitions in repeat expansion disorders. Nature 546, 243–247 (2017).
15. Teng, Y., Zhu, M. & Qiu, Z. G-quadruplexes in repeat expansion disorders. Int. J. Mol. Sci. 24, 2375 (2023).
16. Polla, D. L., Saunders, H. R., de Vries, B. B. A., van Bokhoven, H. & de Brouwer, A. P. M. A de novo variant in the X-linked gene CNKSR2 is associated with seizures and mild intellectual disability in a female patient. Mol. Genet. Genom. Med. 7, e00861 (2019).
17. Gallagher, J. E. G., Ser, S. L., Ayers, M. C., Nassif, C. & Pupo, A. The polymorphic PolyQ tail protein of the mediator complex, med15, regulates the variable response to diverse stresses. Int. J. Mol. Sci. 21, 1894 (2020).
18. Zargar, S., Aljafari, A. A. & Wani, T. A. Variants in MEF2A gene in relation with coronary artery disease in Saudi population. 3 Biotech 8, 289 (2018).
19. Deng, J. et al. Expansion of GGC Repeat in GIPC1 is associated with oculopharyngodistal myopathy. Am. J. Hum. Genet. 106, 793–804 (2020).
20. Rad, A. et al. MAB21L1 loss of function causes a syndromic neurodevelopmental disorder with distinctive cerebellar, ocular, craniofacial and genital features (COFG syndrome). J. Med. Genet. 56, 332–339 (2019).
21. Zhang, L. et al. Novel frameshift mutations of ANKUB1, GLI3, and TAS2R3 associated with polysyndactyly in a Chinese family. Mol. Genet. Genom. Med. 8, e1223 (2020).
22. Elsesser, O. et al. Chromatin remodeler Ep400 ensures oligodendrocyte survival and is required for myelination in the vertebrate central nervous system. Nucleic Acids Res. 47, 6208–6224 (2019).

Acknowledgements

We acknowledge the funding from the CSIR-Young Scientist Project (OLP1123), and we acknowledge ICMR for the fellowship support for Varun Suroliya. We thank UGC for the fellowship support for Manish Kumar. We thank CSIR-IGIB for the technical and scientific support. Additionally, we thank the Ataxia Clinic, AIIMS, for providing samples for this study. We are sincerely thankful to the patients and their families for their participation.

Funding

We acknowledge the funding from the CSIR-Young Scientist Project (OLP1123) and MLP1601 and MLP1802.

Author information

Authors and Affiliations

Department of Neurology, All India Institute of Medical Sciences, Ansari Nagar, Delhi, 110020, India
Varun Suroliya & Achal K. Srivastava
Genomics and Molecular Medicine, CSIR-Institute of Genomics and Integrative Biology, Mall Road, Delhi, 110007, India
Bharathram Uppili, Manish Kumar & Mohammed Faruq
Academy for Scientific and Innovative Research, Ghaziabad, 201002, India
Bharathram Uppili & Manish Kumar
Persistent LABS, Persistent Systems Ltd., Pune, Maharashtra, India
Vineet Jha

Contributions

Study design and conceptualization: MF; Experimental execution: VS and MK; Computational analysis: VS, BU and VJ; Manuscript writing: VS, BU and MF; Critical review and supervision: AKS and MF.

Corresponding author

Correspondence to Mohammed Faruq.

Ethics declarations

Competing interests

The authors declare no competing interests.

Ethical approval

The study was performed under ethical clearance from the Institute Human Ethics Committee (GOMED-MLP1601/2016 and GOMED-MLP1802/2018). Consent to participate was obtained from the participants.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Files

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Suroliya, V., Uppili, B., Kumar, M. et al. Identifying unstable CNG repeat loci in the human genome: a heuristic approach and implications for neurological disorders. Hum Genome Var 11, 25 (2024). https://doi.org/10.1038/s41439-024-00281-0

Received: 03 January 2024
Revised: 08 May 2024
Accepted: 14 May 2024
Published: 13 June 2024
DOI: https://doi.org/10.1038/s41439-024-00281-0
Springer Nature Limited

Editorial Summary

Potential risk alleles in neurological diseases: CNG repeat analysis

Spinocerebellar ataxia and similar muscle and nerve disorders are caused by certain repeated DNA sequences expanding, disrupting normal cell function. In this study, researchers used computer methods to find new DNA repeat expansions that could cause these disorders. They combined computer analysis with genetic testing in a study involving 100 patients with unexplained muscle and nerve disorder symptoms. This method aimed to find cost-effective ways to identify disease-causing genetic changes. The results showed several new unstable DNA repeat expansions that could potentially cause muscle and nerve disorders, although none were definitively linked to the disease in this patient group. The researchers conclude that their method offers a promising way to discover new genetic causes of muscle and nerve disorders. This approach could lead to better diagnosis and understanding of these diseases in the future.

This summary was initially drafted using artificial intelligence, then revised and fact-checked by the author.
