Using Genome Engineering to Understand Huntington's Disease

Huntington's disease (HD) is a fatal, dominantly inherited neurodegenerative disorder caused by a CAG trinucleotide expansion in the Huntingtin (HTT) gene, leading to an expanded polyglutamine (polyQ) region in the encoded protein HTT. We have used homologous recombination (HR) to genetically correct HD patient-derived induced pluripotent stem cells (iPSCs) and found that this reversed HD disease phenotypes. We have exploited genome editing tools, including TALENs (transcription activator-like effector nucleases) and CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats)/Cas9 technology, to carry out genetic correction or expansion, and we were able to detect HR without selection in human cells. The overall goal is to use this technology to model HD-relevant cell types and better understand disease progression by leveraging systems biology approaches. To understand the disease progression, isogenic iPSC lines were created. We found that the disease phenotypes manifested only at the differentiated neural stem cell (NSC) stage, not in iPSCs. Transcriptomic analysis of HD iPSCs and HD NSCs compared to isogenic controls was used to understand the molecular basis for the CAG repeat expansion-dependent disease phenotypes in NSCs. Differential gene expression and pathway analysis identified transforming growth factor β (TGF-β) signaling, netrin-1 signaling and medium spiny neuron (MSN) maturation and maintenance as the top dysregulated pathways in HD NSCs. The ability to create additional isogenic cell lines through CRISPR-mediated HR will further enhance our understanding of HD progression. These lines can be manipulated with CRISPR to understand the effects of common SNPs (single nucleotide polymorphisms) that modulate disease onset in HD, allowing the identification of new pathways and helping to elucidate potential therapeutic targets for HD.
Beyond drug discovery, the CRISPR system could eventually be optimized for use in vivo, correcting a patient's disease-causing mutation in the asymptomatic stages of


Huntington's Disease
Huntington's disease (HD) is a devastating, dominantly inherited movement and psychiatric disorder that is caused by expansion of a CAG trinucleotide repeat in the first exon of the Huntingtin gene (HTT), resulting in translation of an expanded polyQ repeat in the HTT protein. The production of the abnormal expanded polyQ-containing HTT protein leads to a dramatic loss of striatal and cortical neurons and of pro-survival growth factors such as BDNF (brain-derived neurotrophic factor) in HD patients. The polyQ expansion in the HTT protein leads to disrupted cellular homeostasis and activation of cellular death pathways (Fig. 1). Since the disease is inherited in an autosomal dominant fashion, each child of an affected parent has a 50% chance of being affected. HD generally manifests in mid-life, with a mean age of onset of 35-45 years. The disease begins with cognitive disturbances and progresses to severe and debilitating motor symptoms (chorea), usually accompanied by psychiatric disturbances, with death following in about 15-20 years (Landles and Bates 2004). The current therapeutic approaches in HD focus on normalizing molecular pathways disturbed in HD or on lowering the levels of the mutant HTT protein (Canals et al. 2004; Conforti et al. 2008; Zuccato et al. 2008). To date, none of these approaches is approved for use outside of clinical trials, and they will not cure the disease.
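The relationship between the CAG repeat and the polyQ tract described above can be illustrated with a short sketch. This is only a minimal illustration: the flanking sequences and the repeat counts of 21 and 72 are hypothetical toy values, not the real HTT exon 1 sequence, and the function names are invented for this example. It counts the longest uninterrupted run of CAG codons and reports the polyQ tract that run would encode, since each CAG codon specifies one glutamine (Q).

```python
import re


def longest_cag_tract(seq: str) -> int:
    """Return the number of repeats in the longest uninterrupted CAG run."""
    runs = re.findall(r"(?:CAG)+", seq.upper())
    # Each run is a string such as "CAGCAGCAG"; its repeat count is len / 3.
    return max((len(run) // 3 for run in runs), default=0)


def polyq_tract(seq: str) -> str:
    """Return the polyglutamine stretch encoded by the longest CAG run."""
    return "Q" * longest_cag_tract(seq)


# Hypothetical toy alleles: arbitrary flanks around a CAG tract.
FLANK_5 = "ATGGCGACC"
FLANK_3 = "CCGCCACCG"
normal = FLANK_5 + "CAG" * 21 + FLANK_3
expanded = FLANK_5 + "CAG" * 72 + FLANK_3

for name, allele in (("normal", normal), ("expanded", expanded)):
    n = longest_cag_tract(allele)
    print(f"{name}: {n} CAG repeats -> polyQ tract of {n} glutamines")
```

A real analysis would also need to respect the reading frame and handle interrupted repeats, which this sketch deliberately ignores.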
In this chapter, we discuss the use of gene editing tools to model neurological diseases such as HD as well as the potential to use this technology to treat genetic neurological diseases.
Fig. 1 Illustration of the neuronal changes occurring in the striatum of a Huntington's disease patient. The exon 1 CAG expansion in the HTT allele results in a mutant protein being formed; the mutant protein aggregates and is also cleaved into toxic fragments. The aggregates and the toxic fragments result in disrupted cellular homeostasis and eventual neuronal death in the striatum

Gene Editing Enzymes
Targeted gene editing has evolved dramatically in the last 25 years. While originally a technique that only a handful of laboratories had mastered, it is now a common tool used in hundreds of laboratories around the world. One family of gene editing proteins is the customized zinc finger proteins (Segal and Barbas 2000; Wolfe et al. 2000; Pabo et al. 2001; Nagaoka and Sugiura 2000). These proteins were adapted for targeted use in the late 1990s (Liu et al. 1997; Segal et al. 1999; Dreier et al. 2001). Each zinc finger could be designed to recognize three base pairs of DNA through various interactions between the amino acids of the protein's alpha helix and the DNA base pairs (Segal and Barbas 2001). To recognize a specific sequence of DNA, the zinc fingers could be attached to each other, with six zinc fingers recognizing a unique 18-base pair sequence in an organism's genome. The zinc finger proteins could have effector or nuclease domains attached, allowing for gene regulation or gene replacement. The effector domains included VP64 for gene activation, KRAB for gene silencing and DNMT1 for methylation (Beerli et al. 1998; Rivenbark et al. 2012). The nuclease domain could cut targeted genomic sites and allow for mutagenesis or homologous recombination at enhanced efficiency. Zinc finger proteins have been successfully used in human cells and animal organs and have reached Phase II human clinical trials (Geurts et al. 2009; Urnov et al. 2005; SangamoBiosciences 2001; Eisenstein 2012). Although promising, zinc fingers presented several challenges for researchers. Their targeting ability was limited, they required specialized design techniques and they exhibited a frequent incidence of off-target events (Cornu and Cathomen 2010; Gupta et al. 2010; Gabriel et al. 2011). Some advances have been made to reduce the off-target potential and increase detection of these events (Zykovich et al. 2009; Cornu et al. 2008).
The therapeutic potential of zinc fingers for a variety of diseases, including HD, continues to be explored by the biotechnology company Sangamo (Cornu et al. 2008; Wolffe 2016).
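The statement that six zinc fingers can specify a unique 18-base pair sequence can be checked with back-of-the-envelope arithmetic. The sketch below assumes a haploid human genome of roughly 3.1 billion base pairs and treats the genome as a random sequence with equal base frequencies; both are simplifying assumptions made here for illustration, and real genomes contain repeats that violate them.

```python
FINGERS = 6            # zinc fingers in the array (from the text)
BP_PER_FINGER = 3      # base pairs recognized per finger (from the text)
GENOME_BP = 3.1e9      # assumed approximate haploid human genome size

site_length = FINGERS * BP_PER_FINGER   # 18 bp
possible_sites = 4 ** site_length        # number of distinct 18-mers

# Expected number of chance matches to one given 18-mer when both strands
# of a random genome are scanned (a rough approximation).
expected_random_matches = 2 * GENOME_BP / possible_sites

print(f"site length: {site_length} bp")
print(f"distinct sequences of that length: {possible_sites:,}")
print(f"expected chance matches in a random genome: {expected_random_matches:.3f}")
```

Under these assumptions the expected number of chance matches is well below one, which is the sense in which an 18-base pair site can be considered unique.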
In 2009, a new class of gene editing proteins was described: transcription activator-like effectors (TALEs; Boch et al. 2009; Moscou and Bogdanove 2009). These proteins were originally characterized in Xanthomonas bacteria and represented a major advance for DNA-regulating proteins. TALEs, unlike zinc fingers, made contact with individual DNA base pairs, which greatly expanded the sequences that could be targeted in the genome (Moscou and Bogdanove 2009). They were also much easier to design and assemble. Much like zinc fingers, TALEs could have effector or nuclease domains attached to the DNA binding domain, allowing for the DNA to be cut or for genes to be regulated (Christian et al. 2010; Maeder et al. 2013a, b; Cong et al. 2012). Promising experiments in a variety of organisms have validated the efficacy of TALEs, although no human clinical trials have begun. A recent publication has shown the ability of TALEs to specifically silence the mutant HTT allele in cell culture models or to engineer an allelic series into the HTT locus (Wang et al. 2013). TALEs still exhibit off-target effects and may raise immune issues (Guilinger et al. 2014).
Gene editing became a widely accessible technology in 2012 with the characterization of the CRISPR system and its implications for targeted gene editing and regulation. The CRISPR system is composed of a Cas9 nuclease and a guide RNA (gRNA). To cut the DNA, Cas9 binds the gRNA, which targets a specific site in the organism's DNA (Jinek et al. 2012; Wiedenheft et al. 2012). This system is found in archaea and bacteria, where it serves as a natural defense mechanism against bacteriophages. The system has been characterized and adapted for targeted genome editing in mammalian cells. The gRNA has one targeting requirement, a PAM motif (typically NGG) at the 3′ end of the DNA target site; this sequence is common in DNA, and thus almost any gene can be targeted with the CRISPR system (Qi et al. 2013). As with previous gene editing proteins, Cas9 can be modified to either silence or activate gene transcription (Fig. 2; Sander and Joung 2014; Larson et al. 2013). Because of some initial off-target cleavage events, the Cas9 nuclease was modified to become a Cas9 nickase (Cas9n; Ran et al. 2013). This modification drastically increased targeting specificity, as two Cas9n proteins targeting two different DNA sites had to bind in order to make a double strand break in the DNA, which in turn encouraged homologous recombination (HR) with a potential donor DNA strand. Overall, the off-target effects of Cas9n could be reduced to background levels (O'Geen et al. 2015; Wu et al. 2014). The modified Cas9n was found to have similar cleavage efficiency when two gRNAs were used, one targeting each strand of the DNA, resulting in a double strand break. The technique has been widely adopted to create disease-modeling cell lines and rodent and non-human primate models, and it has been applied in non-viable human embryos (Liang et al. 2015; Chen et al. 2015).
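The PAM requirement described above lends itself to a simple search. The following sketch scans both strands of a DNA sequence for candidate target sites followed by an NGG PAM. The 20-nucleotide protospacer length is an assumption chosen for illustration (the text does not specify a length), the function names are invented here, and the sketch makes no attempt to score off-target risk or guide efficiency.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_pam_sites(seq: str, protospacer_len: int = 20):
    """List (strand, start, protospacer, pam) for every site followed by NGG.

    For the "-" strand, `start` is an index into the reverse complement
    of the input, not into the input itself.
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping candidate sites are all found.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % protospacer_len)
    hits = []
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for match in pattern.finditer(strand_seq):
            hits.append((strand, match.start(), match.group(1), match.group(2)))
    return hits


# Hypothetical toy sequence: 20 A's followed by a TGG PAM.
for hit in find_pam_sites("A" * 20 + "TGG"):
    print(hit)
```

Because NGG occurs frequently on one strand or the other, a search like this typically returns many candidates for any gene-sized region, which is consistent with the observation that almost any gene can be targeted.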
What has made the CRISPR system so accessible is that, unlike the zinc fingers and TALEs, the same core protein, Cas9, is used to target any sequence, whereas the targeting portion of the CRISPR system, the gRNA, is what varies. The gRNA can be designed and synthesized either in a standard lab or by an outside company. This separation of the targeting portion (gRNA) of the CRISPR system from the modifying portion (Cas9 or other effectors) allows for targeting multiple genes in one experiment. The ability to target multiple genes in a single experiment drastically reduces the time needed to model complex genetic disorders in which more than one gene is involved. All of these unique characteristics have resulted in a rapid popularization of the CRISPR system in research labs, with thousands of papers having been published in the last five years.

Uses for Gene Editing to Understand Human Diseases
Due to their ability to precisely target a gene or regulatory element, genome editing tools have been widely utilized to model human diseases both in cells and in animals. Neurodegenerative diseases such as Parkinson's disease and HD have been modeled by using genome editing tools to introduce disease-causing mutations into human induced pluripotent stem cells (iPSCs) (Soldner et al. 2011). CRISPR/Cas9 or TALENs can also be injected into zygotes or embryos to generate genetically modified animals. Researchers have injected TALEN-expressing mRNAs into zebrafish embryos to target the gene glucocerebrosidase 1, which is mutated in the lysosomal storage disorder Gaucher's disease. The introduction of these TALENs caused a deletion mutation in the gene encoding glucocerebrosidase 1, and characteristics of Gaucher's disease were present in this zebrafish model (Keatinge et al. 2015). Duchenne muscular dystrophy (DMD) is a neuromuscular disorder caused by a loss-of-function mutation of the gene dmd. A DMD rat model was generated by delivering the CRISPR system into rat zygotes to target the dmd gene (Nakamura et al. 2014). These disease models are valuable tools for the exploration of disease mechanisms and for the pursuit of therapeutics.
When combined with human pluripotent stem cells, genome editing tools can provide some unique advantages in disease modeling and mechanistic studies. Human pluripotent stem cells, including iPSCs and embryonic stem cells, can be differentiated into any cell type of the human body under the correct differentiation conditions. Thus the cell types relevant to the disease, and changes in their development, can be studied in these models. When genome editing tools are used to add or remove a mutation at the pluripotent stem cell stage, isogenic cell lines with an almost identical genetic background are obtained. As cells are differentiated into more restricted stem cells and terminally differentiated cells, the isogenic background will persist. Phenotypic changes in these cells are most likely a result of the mutation, as the cells otherwise share the same genetic background. However, one may still have to consider epigenetic changes and mitochondrial mutations that may remain harbored in the patient's iPSC background (Chinnery et al. 2012; Calvanese et al. 2009). These isogenic cell lines can be subjected to systematic approaches including DNA microarray, RNA-seq and mass spectrometry for transcriptomic and proteomic information. Bioinformatic analysis can identify interesting gene/protein targets or signaling pathways that have distinct disease-associated patterns. The cleaner background of isogenic cell models should result in more relevant and reliable hits. After proper validation, these potentially important disease targets may lead to discovery of new mechanisms or drugs.
Recent advances in stem cell research suggest that iPSCs may provide novel models of disease and new treatments for diseases. An isogenic iPSC line was established in the Ellerby lab through traditional means of HR on a human HD patient iPSC line. To generate this line, a corrected donor strand for the CAG expansion was introduced, converting the disease allele to a wild type allele. The isogenic corrected line had the exact same genetic background as the patient, reducing the genetic variables that are present when one compares disease phenotypes across multiple different patients to matched wild type individuals. One of the first questions we addressed was whether we could take HD patient-derived iPSCs and, through genetic correction of the disease allele, reverse disease phenotypes. Interestingly, we did not detect phenotypes in the undifferentiated HD iPSCs but only observed disease phenotypes in the differentiated neural stem cell (NSC) state, and these phenotypes were reversible upon genetic correction of the patient mutation.
To understand the molecular basis for the CAG repeat expansion-dependent disease phenotypes in iPSCs and NSCs, RNA-Seq was performed comparing the isogenic corrected lines to HD iPSCs and HD NSCs. We observed that there were few differences between HD and wild type iPSCs, but there were substantial differences (over 2000 dysregulated genes) in the NSCs. Some of the key pathways that were dysregulated included TGF-β signaling, netrin-1 signaling and development of the striatum (Fig. 3; Ring et al. 2015). Of particular importance, our isogenic HD iPSCs with corrected alleles identified the maturation or maintenance of medium spiny neurons (MSNs) as being dysregulated. We showed that the pathways or factors that were involved in this process were therapeutic targets for HD. A subsequent publication from another group emphasized that the de-differentiation of MSNs, or loss of MSN identity, is a major source of dysfunction in HD (Langfelder et al. 2016). These pathways offer new options for therapeutic treatments and drug targets. Using genetic engineering, we generated an isogenic allelic HD iPSC series for HD modeling (CAG repeats of 21, 45, 72 and 100). By creating additional isogenic lines, the contribution of the CAG expansion to the disease phenotypes can be separated from background variation; this information can help guide researchers towards additional treatment targets.
Besides directly modifying the disease gene, genome editing tools can also be used to engineer cells to facilitate disease research by making reporter cell lines. In an effort to investigate the roles of a gene encoding a sodium channel subunit in epilepsy, a tdTomato fluorescent protein gene cassette was inserted into iPSCs under a GABAergic neuron-specific promoter with CRISPR/Cas9. When differentiated into GABAergic neurons, these cells were labeled with red fluorescence and could be readily followed for electrophysiological studies. Another example is in the peripheral neuropathy Charcot-Marie-Tooth disease, type 1A. With TALENs, a bioluminescent reporter was integrated under the regulation of the disease-causing gene pmp22, which allowed high throughput screening for reagents that can decrease expression of this gene (Inglese et al. 2014). In an effort to better track the recombination repair efficiency in HD cells, the Ellerby lab has designed a myc-tagged donor strand that, when incorporated into the cell, is detectable by both Western blot and PCR amplification; these methods are so sensitive that recombination efficiencies can be detected at levels as low as 5% (Fig. 4). For polyglutamine diseases, it is also possible to detect the prevalence of the polyglutamine expansion through the use of specific antibodies that detect the expanded polyglutamine region (Fig. 4; An et al. 2014). The ability to quantitatively assess how many cells have been corrected will increase the field's understanding of what may be a therapeutic level of correction for the disease. Having specific tags to monitor genetic correction rates and the resulting phenotypic improvements will advance the field's ability to design genetic correction strategies and optimize treatment conditions.

Gene Editing In Vivo to Treat Genetic Diseases
With its extreme ease of use and targeting, the CRISPR system is being studied extensively with a goal of in vivo correction of genetic mutations. Recent advances have shown that it takes about 15 h for Cas9-mediated double strand breaks to be repaired; this is potentially because Cas9 remains bound to the DNA for an extended period of time and because it asymmetrically releases the target strand (Richardson et al. 2016). This asymmetric release of the strand has given researchers the ability to rationally design the donor strands in an effort to increase gene correction percentages; it also provides additional insight as to how to target and design the donor strands. The guide RNAs have also continued to evolve since the first characterization of the CRISPR system. Initially there were two components to the guide RNA, a crRNA and a tracrRNA; these were later fused into a single guide RNA, creating a simpler method in which the gRNA could be delivered already assembled. Multiple assembled gRNAs could be placed on the same plasmid, allowing for multiple gene targeting with minimal plasmids (Hsu et al. 2013). A couple of new CRISPR variants have been characterized that offer even lower off-target binding levels and are smaller (Ran et al. 2015). Both of these new characteristics may be useful in eventual patient treatment, as a smaller CRISPR protein could be more easily packaged for delivery and lower off-target binding increases the specificity of the CRISPR protein, restricting the effects to the target site.
Fig. 4 (a) Use of a myc tag in the corrected donor plasmid allows for insertion screening both at the DNA level by PCR (left) and at the protein level by Western blot (right); red triangles indicate expected band size. (b) Use of 1C2 antibody screening with an expanded donor plasmid, a rapid method to optimize different gRNA combinations for homologous recombination efficiency
The most exciting application of genome editing tools in human genetic diseases is genetic correction and normalization of disease mutations. Such corrections have been achieved in cells. For example, in myotonic dystrophy type 1, a genetic modification was introduced by TALENs in an NSC model, and this modification partially reversed disease phenotypes (Xia et al. 2015). More encouragingly, genetic correction has been achieved in adult animals. Recently several groups published genetic correction in a mouse DMD model. Adeno-associated virus-delivered CRISPR/Cas9 was used to remove a mutation from the gene dmd. Partial phenotypic recovery has been observed in these studies (Nelson et al. 2016; Tabebordbar et al. 2016). The use of CRISPR in vivo to ablate the rhodopsin gene carrying the dominant S334ter mutation in rats with severe autosomal dominant retinitis pigmentosa also highlights the use of genetic correction in disease (Bakondi et al. 2016). These proof-of-principle experiments may be the first steps towards overcoming many currently incurable genetic diseases. CRISPR technology is already being used in human cells and disease models with the eventual goal of patient treatment. A recent study conducted in China has even used CRISPR technology on non-viable human embryos (Liang et al. 2015). As this technology has advanced so rapidly, the scientific community has held a summit meeting to discuss the potential future of CRISPR technology, much in the same way the Asilomar Conference discussed recombinant DNA over 40 years ago (Baltimore et al. 2015; Berg et al. 1975a, b).
In HD, it is possible that a variety of CRISPR tools could prove beneficial for treatment. Previous studies have shown that a reduction in mutant HTT levels can ameliorate symptoms of the disease (Canals et al. 2004; Conforti et al. 2008; Zuccato et al. 2008). A recent study has shown reduction of mutant Huntingtin in cells by using TALE-ATFs (artificial transcription factors) to specifically target the mutant allele by targeting SNPs common on that allele. The TALE-ATF has a KRAB domain attached that represses transcription of the mutant Huntingtin allele. This technique has yet to be tried in HD model mice; however, previous studies have used ATFs to repress transcription in the brains of mice (Bailus et al. 2016). Another approach using CRISPR would involve increasing transcription of genes that could be neuroprotective in HD; BDNF could be a potential target for this type of therapy (Pollock et al. 2016). As screening studies are further refined using more genetically engineered isogenic cell lines, it will be possible to uncover additional gene regulation targets.
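The idea of targeting SNPs that reside on the mutant allele can be sketched computationally. The example below is a hypothetical illustration rather than the design procedure used in the cited study: it compares two aligned allele sequences of equal length, locates single-base differences, and enumerates every binding-site window on the mutant allele that covers a SNP. The 18-base pair site length, the toy sequences and the function names are all assumptions introduced here.

```python
def snp_positions(wild_type: str, mutant: str) -> list:
    """Return the indices at which two aligned, equal-length alleles differ."""
    if len(wild_type) != len(mutant):
        raise ValueError("alleles must be aligned and of equal length")
    return [i for i, (a, b) in enumerate(zip(wild_type, mutant)) if a != b]


def allele_specific_sites(wild_type: str, mutant: str, site_len: int = 18):
    """List (start, site, snp_offset) for mutant-allele windows covering a SNP.

    Each returned site differs from the wild type allele at `snp_offset`,
    so a DNA-binding domain designed against it would, in principle,
    match the mutant allele better than the wild type allele.
    """
    sites = []
    for pos in snp_positions(wild_type, mutant):
        first = max(0, pos - site_len + 1)
        last = min(pos, len(mutant) - site_len)
        for start in range(first, last + 1):
            sites.append((start, mutant[start:start + site_len], pos - start))
    return sites


# Hypothetical toy alleles differing by one base at index 20 (A -> G).
wt_allele = "ACGT" * 10
mut_allele = wt_allele[:20] + "G" + wt_allele[21:]

candidates = allele_specific_sites(wt_allele, mut_allele)
print(len(candidates), "candidate sites cover the SNP")
```

In practice, candidate sites would still need to be filtered for the targeting rules of the chosen DNA-binding platform and tested experimentally for allele selectivity.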
The ideal therapy for HD would involve gene replacement therapy, where the mutant allele would be replaced by a corrected donor allele. Using the CRISPR system, it will eventually be possible to do this correction in vivo. When designing the donor strand, it is possible to detect site-specific insertion by PCR if a small tag is added to the donor strand, allowing for optimization of different CRISPR components (Fig. 4). After design and condition optimization, there are still several issues that need to be addressed to develop CRISPR into an in vivo therapy. One area to examine is the immune response: Cas9 is not an endogenous protein in mammals and may elicit an immune response if given over an extended period of time, although there are mouse models that constitutively express Cas9 from birth (Platt et al. 2014). Previous studies in humans with zinc finger proteins have shown minimal immune response. A second major concern for gene correction in vivo is the delivery of the CRISPR system to the desired organ or tissue. For certain diseases, it may be possible to directly inject the organ and correct only a subpopulation of the cells; for other diseases, especially those that affect the brain, delivery is more difficult (Yin et al. 2016). Direct injection into the brain is possible, and packaging the CRISPR system into an appropriately pseudotyped viral vector could allow for additional coverage beyond the injection point. The CRISPR system has been packaged into both AAV and lentivirus and used successfully in several mouse studies (Yin et al. 2016; Senis et al. 2014; Wang et al. 2015; Graham 2016). Nanoparticles and purified proteins are additional methods that have been used to successfully deliver CRISPR into cells and tissues (Ramakrishna et al. 2014). Each of these delivery methods has advantages and disadvantages, but with additional optimization successful gene replacement therapy in vivo should be possible.
Since early HD diagnosis is possible, genetic correction therapy could be performed during the asymptomatic stage, potentially preventing onset of the disease.

Conclusion
Genome engineering is providing neuroscientists with new methods to address critical questions in the field and offers the hope for new treatments of neurological genetic diseases. The application of genetic engineering to disease modeling is accelerating efforts to understand the molecular mechanism of these diseases and offers new approaches to identifying therapeutic targets and drugs. The recent advances in genetic engineering allow for better modeling and understanding the role of SNPs in diseases with complex genetic alterations. These new genomic engineering technologies, which precisely alter the genome, are already offering insights into the complexity of the nervous system, its normal function and alterations in disease. Eventually these genome engineering technologies may correct the disease allele in human patients (in vivo) before symptoms manifest, resulting in therapy at the DNA level.