Whole exome sequencing of Rett syndrome-like patients reveals the mutational diversity of the clinical phenotype

Classical Rett syndrome (RTT) is a neurodevelopmental disorder in which most cases carry MECP2 mutations. Atypical RTT variants involve mutations in CDKL5 and FOXG1. However, a subset of RTT patients do not carry any mutation in the described genes. Whole exome sequencing was carried out in a cohort of 21 female probands with clinical features overlapping with those of RTT, but without mutations in the customarily studied genes. Candidates were functionally validated by assessing the appearance of a neurological phenotype in Caenorhabditis elegans upon disruption of the corresponding ortholog gene. We detected pathogenic variants that accounted for the RTT-like phenotype in 14 (66.7 %) patients. Five patients were carriers of mutations in genes already known to be associated with other syndromic neurodevelopmental disorders. We determined that the other patients harbored mutations in genes that have not previously been linked to RTT or other neurodevelopmental syndromes, such as the ankyrin repeat containing protein ANKRD31 or the neuronal acetylcholine receptor subunit alpha-5 (CHRNA5). Furthermore, worm assays demonstrated that mutations in the studied candidate genes caused locomotion defects. Our findings indicate that mutations in a variety of genes contribute to the development of RTT-like phenotypes. Electronic supplementary material The online version of this article (doi:10.1007/s00439-016-1721-3) contains supplementary material, which is available to authorized users.

Introduction
Rett syndrome (RTT) is a neurodevelopmental disorder, described in the 1960s by Andreas Rett (Rett 1966), that most frequently manifests itself in girls during early childhood, with an incidence of approximately 1 in 10,000 live births (Chahrour and Zoghbi 2007). RTT patients are asymptomatic during the first 6-18 months of life, but gradually develop severe motor, cognitive, and behavioral abnormalities that persist for life. It is the second most common cause of intellectual disability in females after Down's syndrome (Chahrour and Zoghbi 2007). Around 90 % of the cases are explained by more than 800 reported mutations in the methyl CpG-binding protein 2 gene (MECP2) (RettBASE: MECP2 Variation Database) (Christodoulou et al. 2003). MECP2 is located on the X chromosome and was originally identified as encoding a protein that binds to methylated DNA (Lewis et al. 1992); its mutation causes most of the classical or typical forms of RTT (Chahrour and Zoghbi 2007). Individuals affected by atypical or variant RTT present with many of the clinical features of RTT, but do not necessarily have all of the classic characteristics of the disorder (Neul et al. 2010). Approximately 8 % of classic RTT and 42 % of variant RTT patients are MECP2 mutation-negative (Monros et al. 2001; Percy 2008). Some of the latter group have mutations in other genes, such as the cyclin-dependent kinase-like 5 gene (CDKL5), which is described in individuals with an early seizure onset variant of RTT (Kalscheuer et al. 2003), or the forkhead box G1 gene (FOXG1), which is responsible for the congenital variant of RTT (Ariani et al. 2008). However, there remains a subset of patients with a clinical diagnosis of RTT who are mutation-negative for all the aforementioned genes. Next generation sequencing (NGS) has emerged as a potentially powerful tool for the study of such genetic diseases (Zhu et al. 2015).
Herein, we report the use of a family-based exome sequencing approach in a cohort of 20 families with clinical features of RTT, but without mutations in the usually studied genes. We establish the neurological relevance of the newly identified candidate genes by assessing them in a Caenorhabditis elegans model.

Patient samples
A cohort of 19 Spanish parent-child trios and one family with two affected daughters who exhibited clinical features associated with RTT were recruited at Sant Joan de Deu Hospital in Barcelona, Catalonia, Spain. These patients had been diagnosed on the basis of the usual clinical parameters (Monros et al. 2001), and according to the recently revised RettSearch International Consortium criteria and nomenclature (Neul et al. 2010), but were found to be mutation-negative for MECP2, CDKL5 and FOXG1 in the original single-gene screening. The parents were clinically evaluated, and no evidence of intellectual disability was observed. Genomic DNA from these patients was extracted from peripheral blood leukocytes using standard techniques, and analyzed by exome sequencing at the Cancer Epigenetics and Biology Program (PEBC) in Barcelona, Catalonia, Spain. Ethical approval for the molecular genetic studies was obtained from each institutional review board.

Whole exome sequencing and Sanger validation
Coding regions were captured using the TruSeq DNA Sample Preparation and Exome Enrichment Kit (Illumina, San Diego, California). Paired-end 2 × 100 bp reads were generated with the Illumina HiScan SQ system at the National Center for Genomic Analysis in Barcelona. To improve the de novo single nucleotide variant calling, we also included in the data processing the exome sequencing data of three RTT-associated families carrying MECP2, CDKL5 and JMJD1C (Sáez et al. 2016) mutations, respectively. The complete exome sequencing data of all the studied samples are available from the Sequence Read Archive (http://www.ncbi.nlm.nih.gov/sra) with the ID: SRP073424 (private link for the reviewer until publication: http://www.ncbi.nlm.nih.gov/sra/SRP073424). The overall coverage statistics for each individual of the families, considering the regions captured using the Exome Enrichment Kit, and the number of reads at the position of each variant are shown in Supplementary

Caenorhabditis elegans handling
The techniques used for the culture of Caenorhabditis elegans were essentially as described (Brenner 1974). The worms were backcrossed at least three times to avoid background mutations. The behavior of three sets of ten animals was independently assessed in locomotion assays without food that were performed at 20 °C, as previously described (Sawin et al. 2000).

Clinical criteria for selecting RTT trios
The 21 patients (derived from the 20 families studied) included in this study fulfilled the recently revised clinical criteria for the diagnosis of RTT following the usual clinical parameters (Monros et al. 2001), and the RettSearch International Consortium criteria and nomenclature (Neul et al. 2010). Specifically, all patients presented stereotypic hand movements, 90.5 % of them (19/21) showed microcephaly and also presented onset of the first signs of the disease before the age of 12 months. 66.7 % of patients (14/21) acquired motor skills, while a further seven (33.3 %), who had a more severe phenotype, never walked. Language skills were progressively lost in 28.6 % of the patients and 71.4 % of them (15/21) never acquired them. Additionally, important episodes of epilepsy were experienced by 81.0 % of the patients (17/21), and 57.1 % of them (12/21) manifested apneas and/or hyperventilation.
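The percentages quoted above follow directly from the patient counts out of 21. As a quick consistency check, the arithmetic can be sketched in Python. The dictionary keys are illustrative labels rather than terms from the study's data files, and the count of 6 patients for progressive language loss is an assumption inferred from the reported 28.6 %.

```python
# Counts of patients (out of 21) showing each feature, as reported in the text.
# The key names are illustrative labels only.
TOTAL_PATIENTS = 21

feature_counts = {
    "microcephaly": 19,
    "acquired_motor_skills": 14,
    "never_walked": 7,
    "language_lost": 6,  # assumption: inferred from the reported 28.6 % of 21
    "language_never_acquired": 15,
    "epilepsy": 17,
    "apneas_or_hyperventilation": 12,
}


def percentage(count, total=TOTAL_PATIENTS):
    """Return count/total as a percentage rounded to one decimal place."""
    return round(100.0 * count / total, 1)


percentages = {name: percentage(n) for name, n in feature_counts.items()}

for name, value in percentages.items():
    print(f"{name}: {value} %")
```

Each computed value matches the figure given in the text once rounded to one decimal place.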

Bioinformatic process for filtering and selecting pathogenic variants
Before their inclusion in this study, patients underwent an extensive clinical and genetic work-up to detect genetic alterations in MECP2, CDKL5, and FOXG1. However, no molecular diagnosis could be established. We performed whole exome sequencing (WES) on the 61 individuals (20 pairs of healthy parents and 21 affected daughters) separately by subjecting whole-blood-derived genomic DNA to exome enrichment and sequencing. We focused our analysis on de novo single nucleotide variants (SNVs) due to their known relevance in autism and mental retardation-related diseases (Vissers et al. 2010). On average, WES yielded 419,045 variants per family, including SNVs and indels, of which 19,951 non-synonymous variants (4.7 %) were predicted to have a functional impact on the genomic sequence. To select variants that had not previously been described in the healthy population, we filtered out the variants with an allele frequency of 1 % or higher (the classic definition of a polymorphism) previously observed in the Single Nucleotide Polymorphism database (dbSNP) and the 1000 Genomes Pilot Project data. Afterwards, to focus on de novo inheritance, patients' variants were filtered first against variants found in their own parents and then against a pool of controls comprising all the healthy parents included in the study. Following this process, we obtained an average of 106 SNVs per family, which corresponded to 81 mutated genes per family. De novo candidate variants were selected on the basis of the quality of the alignments, damage score predictors and the evolutionary conservation level of each gene. The complete exome sequencing data of all the studied samples are available from the Sequence Read Archive (http://www.ncbi.nlm.nih.gov/sra).
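The three filtering steps described above (population frequency, parental variants, pooled controls) can be illustrated with a minimal Python sketch. This is not the pipeline used in the study: the `Variant` class, its field names, the function names and the toy variants are all hypothetical, and the later selection on alignment quality, damage predictors and conservation is not modeled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Variant:
    """A called variant; every field name here is hypothetical."""
    chrom: str
    pos: int
    ref: str
    alt: str
    # Highest frequency seen in dbSNP / 1000 Genomes; None if never observed.
    population_af: Optional[float] = None

    @property
    def key(self):
        # Identity of the variant, independent of its frequency annotation.
        return (self.chrom, self.pos, self.ref, self.alt)


MAX_POPULATION_AF = 0.01  # 1 % or higher is treated as a polymorphism


def is_rare(variant):
    """True if the variant is unseen or below the 1 % frequency threshold."""
    return variant.population_af is None or variant.population_af < MAX_POPULATION_AF


def candidate_de_novo(proband, father, mother, control_pool):
    """Apply the three filters in the order given in the text."""
    parental_keys = {v.key for v in father} | {v.key for v in mother}
    control_keys = {v.key for v in control_pool}
    rare = [v for v in proband if is_rare(v)]
    not_inherited = [v for v in rare if v.key not in parental_keys]
    return [v for v in not_inherited if v.key not in control_keys]


# Toy data, invented purely for illustration.
proband = [
    Variant("1", 100, "A", "G", 0.20),   # common polymorphism
    Variant("2", 200, "C", "T"),         # also present in the father
    Variant("3", 300, "G", "A"),         # also present in the control pool
    Variant("4", 400, "T", "C", 0.001),  # rare and absent elsewhere
]
father = [Variant("2", 200, "C", "T")]
mother = []
control_pool = [Variant("3", 300, "G", "A")]

result = candidate_de_novo(proband, father, mother, control_pool)
print([v.key for v in result])
```

In this toy example only the variant on chromosome 4 survives all three filters. Comparing variants by position and alleles, rather than by the full record, keeps a parental call from being missed when its frequency annotation differs.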
Overall, the bioinformatic process described herein identified 22 coding de novo mutations in 66.7 % (14 of 21) of the Rett-like patients: 20 SNVs and 2 indels. The identified variants and their de novo status were confirmed by conventional Sanger sequencing. Illustrative samples are shown in Fig. 1. Interestingly, in seven (33.3 %) of the studied RTT probands, exome sequencing did not detect any genetic change relative to their respective parents. The clinical characteristics of these seven patients without obvious pathogenic variants are summarized in Table 1. In one of the families, there were two affected children, and an analysis of potentially relevant recessive variants was performed. For the recessive analysis, following the same criteria to define a variant as deleterious, we selected the variants with a homozygous recessive genotype and then, at the gene level, also selected the genes presenting more than one heterozygous variant (compound heterozygosity). We did not find any candidate gene consistent with the phenotype of the family with the two affected sisters.
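The gene-level selection used in the recessive analysis can be sketched as follows. This is a simplified illustration under stated assumptions, not the study's implementation: the input format, the function name and the placeholder gene names are hypothetical, requiring a gene to be flagged in both sisters is an assumption, and a real compound heterozygosity call would additionally need to confirm that the two variants were inherited from different parents.

```python
from collections import defaultdict


def recessive_candidate_genes(calls):
    """Select recessive candidate genes for one patient.

    calls: iterable of (gene, variant_id, genotype) tuples for variants
    already classified as deleterious; genotype is 'hom' or 'het'.
    Returns (homozygous_genes, compound_het_genes).
    """
    homozygous_genes = set()
    het_variants_by_gene = defaultdict(set)
    for gene, variant_id, genotype in calls:
        if genotype == "hom":
            homozygous_genes.add(gene)
        elif genotype == "het":
            het_variants_by_gene[gene].add(variant_id)
    # More than one distinct heterozygous variant in the same gene.
    compound_het_genes = {
        gene for gene, ids in het_variants_by_gene.items() if len(ids) > 1
    }
    return homozygous_genes, compound_het_genes


# Placeholder calls for two affected sisters (invented for illustration).
sister_1 = [
    ("GENE_A", "v1", "hom"),
    ("GENE_B", "v2", "het"),
    ("GENE_B", "v3", "het"),
    ("GENE_C", "v4", "het"),
]
sister_2 = [
    ("GENE_B", "v2", "het"),
    ("GENE_B", "v3", "het"),
    ("GENE_D", "v5", "hom"),
]

hom_1, comp_1 = recessive_candidate_genes(sister_1)
hom_2, comp_2 = recessive_candidate_genes(sister_2)

# Assumption: a causal gene should be flagged in both affected sisters.
shared_candidates = (hom_1 | comp_1) & (hom_2 | comp_2)
print(sorted(shared_candidates))
```

With these placeholder calls, only GENE_B is flagged in both sisters, because each carries the same two heterozygous variants in that gene.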

Variants in genes previously associated with neurodevelopmental disorders
Of the 22 identified coding de novo mutations in the assessed RTT-like patients, five (22.7 %) occurred in genes previously associated with neurodevelopmental disorders that presented a clinicopathological phenotype with features coinciding with those of Rett syndrome (Table 2). In particular, we identified mutations in HCN1 (Nava et al. 2014) and GRIN2B (Endele et al. 2010; Lemke et al. 2014), which are associated with early infantile epileptic encephalopathy; SLC6A1, which is associated with epilepsy and myoclonic-atonic seizures (Carvill et al. 2015); TCF4, which is associated with Pitt-Hopkins syndrome (Sweatt 2013); and SCN1A, which is associated with Dravet syndrome (Brunklaus and Zuberi 2014) (Table 2). The clinical characteristics of these five patients with variants in genes previously associated with neurodevelopmental phenotypes are summarized in Table 3. Table 4 compares the clinical features of these RTT-like patients with those observed in the neurodevelopmental disorders previously associated with the mutated genes.

Variants in genes previously not associated with neurodevelopmental disorders
Of the 22 identified coding de novo variants in the RTT-like patients assessed here, 17 (77.3 %) occurred in genes that had not previously been associated with neurodevelopmental disorders (Table 5). However, two of these variants were associated with non-neurodevelopmental disorders: a BTBD9 variant linked to restless leg syndrome (Kemlink et al. 2009), and an ATP8B1 SNV associated with familial cholestasis (Klomp et al. 2004). Interestingly, the BTBD9 variant was detected in the same patient that carried the SCN1A variant associated with Dravet syndrome (Table 2). The other 15 potentially pathogenic variants identified occurred in genes that had not been linked to any genetic disorder of any type. However, there was an enrichment of genes with a potential role in neuronal biology and functionality, such as the gamma-aminobutyric type

Figure panel labels: Father, Mother, Proband

Neurological phenotype of candidate genes in C. elegans
To demonstrate a neurological effect for a loss of function of the detected genes that had not previously been associated with neurodevelopmental disorders (Table 5), we used the model organism C. elegans to confirm the genotype-phenotype correlation. We obtained all the available C. elegans mutants that carry deleterious mutations in the orthologs of the human genes with potentially pathogenic mutations in the patients. In this model, backcrossing is a commonly used procedure to obtain a specific mutant strain free of secondary mutations in its genetic background. Under these conditions, we were able to test six available mutant strains that were backcrossed at least three times to ensure that any observed phenotype was associated with the specific mutations in the orthologous genes. To this end, we studied the C. elegans mutants carrying deleterious mutations in the orthologs of the human genes PDLIM7, ANKRD31, ZNF620, CHRNA5, MGRN1 and GABBR2 described in Table 7. Considering that the loss of normal movement and coordination is one of the clearest signs shown by Rett patients, we performed a locomotion assay of the nematodes as previously described (Sawin et al. 2000), using the wild-type N2 strain as a control (Supplementary Video 1). We observed that in 83.3 % (5 of 6) of the cases, mutation of the C. elegans ortholog of a gene identified by human exome sequencing produced a locomotion-defective phenotype (Fig. 2). The most severe phenotypes were represented by alp-1, unc-44 and pag-3, with mutations in the orthologs of PDZ and LIM domain protein 7 (PDLIM7), ankyrin repeat containing protein ANKRD31 and the zinc finger protein ZNF620, respectively (Fig. 2 and Supplementary Videos 2, 3 and 4). The case of alp-1 was particularly interesting, because mutant worms were not only thinner than usual and completely locomotion defective, but they also exhibited transitory spasms.
Significant defects, such as slower locomotion and uncoordinated movement, were also observed in the mutants of unc-63 and C11H1.3, the C. elegans orthologs of the genes coding for the neuronal acetylcholine receptor subunit alpha-5 (CHRNA5) and mahogunin RING finger protein 1 (MGRN1), respectively. Although we did not find a clear locomotion defect in the gbb-2 mutant (the ortholog of GABBR2) (Fig. 2), such a defect occurs in the gbb-1;gbb-2 double mutant (Dittman and Kaplan 2008), gbb-1 being the C. elegans ortholog of GABBR1 (gamma-aminobutyric acid type B receptor subunit 1). The clinical picture of the particular RTT cases with mutations in the genes studied in C. elegans is shown in Table 6.

Discussion
Our results indicate that de novo variants in genes with potential neurological functions, such as neuronal receptors (GABBR2 and CHRNA5), axon guidance molecules (SEMA6B), synaptic ion channels (CACNA1I) and others, contribute to the development of RTT-like clinical phenotypes in the context of wild-type sequences for standard Rett genes such as MECP2 and FOXG1. These patients share most of the clinicopathological features of classic RTT syndrome, such as stereotypic hand movements, relative microcephaly, and onset of the disease after the age of 12 months. Thus, exome sequencing is a powerful tool for genetically characterizing these enigmatic cases. In this regard, once a new candidate gene has been identified, it is now possible to design specific sequencing strategies for the molecular screening of this particular target in larger populations of patients with intellectual disability. The strategy of exome sequencing patients who have RTT features, but no known mutations in the usual genes, has recently been used in other smaller series of patients (Grillo et al. 2013; Okamoto et al. 2015; Hara et al. 2015; Olson et al. 2015; Lopes et al. 2016). Most importantly, our study and the aforementioned previous reports strengthen the concept that a heterogeneous mutational profile hitting shared neurological signaling pathways contributes to RTT-like syndromes. Examples of confluence in the same molecular crossroads include the gamma-aminobutyric acid type B receptor subunit 2 (GABBR2) de novo variant, described here, and the formerly identified variant in the gamma-aminobutyric acid receptor delta gene (GABRD) (Hara et al. 2015). Interestingly, a second RTT-like patient has been identified as being a carrier of a de novo GABBR2 variant (Lopes et al. 2016), highlighting the likelihood that this gene and pathway contribute to the clinical entity.
Another example of similarly targeted genes in RTT-like patients is that of the proteins containing ankyrin repeats that are involved in the postsynaptic density (Durand et al. 2007). This study has revealed de novo variants in the ankyrin repeat containing proteins AGAP6 and ANKRD31 in RTT-like patients, and de novo variants of the SH3 and multiple ankyrin repeat domains 3 protein (SHANK3) (Hara et al. 2015) and ankyrin-3 (ANK3) (Grillo et al. 2013) have been reported in two RTT-like patients. A final example of the convergence of cellular pathways to provide a common RTT-like phenotype is represented by the disruption of ion channels. We found a de novo variant of the voltage-gated calcium channel subunit alpha 1I (CACNA1I) in an RTT-like patient. Additionally, de novo variants in the calcium release channel RYR1 (Grillo et al. 2013) and the sodium voltage-gated channel alpha subunit 2 (SCN2A) (Baasch et al. 2014) have been reported in two other RTT-like probands. It is also intriguing that in our study a variant in HAP1 was found, whereas in similar series heterozygous variants in huntingtin (HTT) have been described (Lopes et al. 2016; Rodan et al. 2016), further reinforcing the links between Huntington's disease and Rett syndrome (Roux et al. 2012). Another interesting case is provided by TCF4, which is associated with Pitt-Hopkins syndrome (Sweatt 2013) and in which, in addition to our study, others have found mutations in RTT-like patients (Lopes et al. 2016). This observation could be of interest for clinicians due to phenotypic similarities such as intellectual disability, stereotypic movement, apneas and seizures (Marangi et al. 2012).
Our findings also suggest that a substantial degree of clinical overlap can exist between the features associated with RTT and those of other neurodevelopmental disorders. Our exome sequencing effort indicated that probands originally diagnosed as RTT-like patients were, in fact, carriers of well-known pathogenic de novo mutations linked to Dravet syndrome (SCN1A), myoclonic-atonic epilepsy (SLC6A1), or early infantile epileptic encephalopathies 24 (HCN1) and 27 (GRIN2B). The purely clinical classification of these patients, without a thorough genetic study, can be difficult because some of these patients are composites that carry at least two pathogenic variants. For example, in our cases, the Dravet syndrome patient also had a de novo variant in BTBD9, a gene associated with the development of restless leg syndrome. In addition, among the patients with newly identified candidate genes associated with RTT-like features, a few simultaneously carried two de novo variants (e.g., probands 8, 19 and 21), further complicating the tasks of correctly diagnosing and managing these individuals.
Finally, the studies performed in C. elegans support the functional relevance of the newly proposed candidate genes for nervous system function. Future studies will be necessary to assess the role of the specific variants identified, such as rescuing the defects with the expression of normal cDNAs versus cDNAs containing the mutation, ideally using cDNAs of human origin to prove similar function of the gene in the two species. It is also relevant to mention that for some of the newly reported mutated genes in our RTT-like patients, there are mouse models targeting the described loci that show neurological phenotypes, such as those for BTBD9 (motor restlessness and sleep disturbances) (DeAndrade et al. 2012), MGRN1 (spongiform neurodegeneration) (He et al. 2003), SEMA6B (aberrant mossy fibers) (Tawarayama et al. 2010), CHRNA5 (alterations in the habenulo-interpeduncular pathway) (Fowler et al. 2011), GABBR2 (anxiety and depression-related behavior) (Mombereau et al. 2005) and HAP1 (depressive-like behavior and reduced hippocampal neurogenesis) (Chan et al. 2002; Xiang et al. 2015).

Conclusions
Overall, this study demonstrates the mutational diversity that underlies the clinical diagnosis of patients with features resembling RTT. Once the recognized MECP2, CDKL5 and FOXG1 mutations have been ruled out, exome sequencing emerges as a very useful strategy for the more accurate classification of these patients. The de novo variants identified by this approach can redirect the initial diagnosis towards another neurodevelopmental disorder, or pinpoint new genes involved in the onset of RTT-like features. Interestingly, most of these new targets are involved in the same functional networks associated with correct neuronal functionality. Further research is required to understand the role of these proteins in the occurrence of neurodevelopmental diseases. Additional functional experiments, such as the C. elegans assays used in this study, would be extremely helpful for this purpose.